PREFACE

In this short volume, we deal with some tests of hypothesis
which are frequently encountered in the analysis of multivariate
data. The type of hypothesis considered is that which can be
answered in the negative or affirmative by the statistician (with
certain calculable probabilities of being wrong).

It will be recognized that this type of hypothesis covers
but a small part of the statistician's work in the multivariate
field. In the simplest of all cases we may be presented with a
sequence of vector observations and asked to judge whether or not
these vector observations could have been drawn from a population
with vector mean $\mu_0$. If the statistician judges "no", then
almost inevitably he will be asked "if the true population mean
vector is $\mu$, which you (the statistician) say is not $\mu_0$,
in which of its elements is $\mu$ different from $\mu_0$?" The best
the statistician can do in answer to this question is to make an
educated guess. For suppose the vector has m elements and
$\mu - \mu_0 = \epsilon$. The statistician has judged that the
vector $\epsilon$ is not the vector of all zeros and, by
controlling his first kind of error to the usual 5%, he will be
wrong on the average one time in twenty situations where
$\epsilon$ really is $0$. But in answer to the follow-up question
he is being asked to pick the correct alternative of $2^m - 1$
possible alternatives:

that $\epsilon$ differs from $0$ only in its i-th position for some i
that $\epsilon$ differs from $0$ only in its i-th and j-th position
for some i and j ($i \neq j$)
etcetera
that $\epsilon$ differs from $0$ in all but its i-th position for some i.

There are no statistical techniques which enable the statistician
to announce (for example) "$\mu$ differs from $\mu_0$ in its first
and third position only and there's a 5% chance that I'm wrong."
The heart of the matter lies in the theoretical impossibility of
putting a probability of error on the statement. In connexion with
this, the reader should consult section 11.6; page 258.

Nevertheless, certain procedures are available which can be
described only as logical procedures upon which to base an
educated guess. These procedures are not developed in this
volume; the reader is referred to Part III of this contract,
written by Rolf Bargmann, for a detailed discussion, illustration
and analysis of a similar type of multi-alternative problem, and
to Chapter 15 of these notes for an introduction to that problem.



CHAPTER 10 

THE DOOLITTLE METHOD 



10.1 Introduction 

In the process of the analysis of actual experimental data,
it will frequently be necessary to evaluate the determinant of a
matrix of high order, to solve a matrix equation, or to invert a
matrix of high order. The Doolittle method is well established
as a computational technique which lends itself to the use of a
desk calculator or electronic computer and is widely used in the
analysis of multivariate data. The evaluation of a determinant,
matrix inversion, and solution of the matrix equation are
accomplished through a series of systematic elementary
mathematical operations.



10.2 The forward Doolittle process.

Given a symmetric, positive definite matrix A with known
elements, we may wish to

(a) evaluate $|A|$

(b) determine $A^{-1}$

(c) determine $\Phi$ satisfying $A\Phi = B$ (B given).

We shall take A to be m×m; in problem (c) above, B may be m×k
(k<m) so that $\Phi$ is also m×k. It is noted that problems (b)
and (c) are not dissimilar, for if we set B=I, then
$\Phi = A^{-1}$. We will discuss problem (c) and give the methods
for (a) and (b) as special cases.



Set out algebraically, the computation table has the appearance

 a_11    a_12    a_13    a_14    ...  a_1m    | b_11    b_12    b_13    ...  b_1k    | χ_1
 a*_11   a*_12   a*_13   a*_14   ...  a*_1m   | b*_11   b*_12   b*_13   ...  b*_1k   | χ*_1
 1       a**_12  a**_13  a**_14  ...  a**_1m  | b**_11  b**_12  b**_13  ...  b**_1k  | χ**_1

 (a_21)  a_22    a_23    a_24    ...  a_2m    | b_21    b_22    b_23    ...  b_2k    | χ_2
 (0)     a*_22   a*_23   a*_24   ...  a*_2m   | b*_21   b*_22   b*_23   ...  b*_2k   | χ*_2
 (0)     1       a**_23  a**_24  ...  a**_2m  | b**_21  b**_22  b**_23  ...  b**_2k  | χ**_2

 (a_31)  (a_32)  a_33    a_34    ...  a_3m    | b_31    b_32    b_33    ...  b_3k    | χ_3
 (0)     (0)     a*_33   a*_34   ...  a*_3m   | b*_31   b*_32   b*_33   ...  b*_3k   | χ*_3
 (0)     (0)     1       a**_34  ...  a**_3m  | b**_31  b**_32  b**_33  ...  b**_3k  | χ**_3

 etcetera

                                      a_mm    | b_m1    b_m2    b_m3    ...  b_mk    | χ_m
                                      a*_mm   | b*_m1   b*_m2   b*_m3   ...  b*_mk   | χ*_m
                                      a**_mm  | b**_m1  b**_m2  b**_m3  ...  b**_mk  | χ**_m




It is noted that the first row of the table is the first row of A,
followed by the first row of B, followed by a "check" entry
$\chi_1$ which is actually the sum of all elements (m+k of them)
in the first rows of A and B, so that

(10.2.1)    $\chi_1 = \sum_{j=1}^{m} a_{1j} + \sum_{j=1}^{k} b_{1j}$ ;

the second row of the table is the first row repeated [in practice
the first row could be omitted; it is included here for symmetry].
This second row, which is the first starred row, is defined then by:

(10.2.2)    $a^{*}_{1j} = a_{1j}$ ,    j = 1, 2, ..., m
            $b^{*}_{1j} = b_{1j}$ ,    j = 1, 2, ..., k
            $\chi^{*}_{1} = \chi_{1}$ .


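The third row (first double-starred row) is produced by dividing
all elements in the preceding row by the leading element of
that row (that is, by $a^{*}_{11}$). The first row of
double-starred elements is defined then by:

(10.2.3)    $a^{**}_{1j} = a^{*}_{1j} / a^{*}_{11}$ ,    j = 1, 2, ..., m
            $b^{**}_{1j} = b^{*}_{1j} / a^{*}_{11}$ ,    j = 1, 2, ..., k
            $\chi^{**}_{1} = \chi^{*}_{1} / a^{*}_{11}$ .
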

The fourth row is produced by writing down the elements of the
second row of A followed by the elements of the second row of B,
followed by the "check" entry $\chi_2$, where

(10.2.4)    $\chi_2 = \sum_{j=1}^{m} a_{2j} + \sum_{j=1}^{k} b_{2j}$ .

In practice the element $a_{21}$ is not written in (it appears in
brackets in the table) since the elements which will appear under
it are zero and do not really enter into the computations. The
second starred row is computed via the equations:

(10.2.5)    $a^{*}_{2j} = a_{2j} - a^{*}_{12} a^{**}_{1j}$ ,    j = 2, ..., m
            $b^{*}_{2j} = b_{2j} - a^{*}_{12} b^{**}_{1j}$ ,    j = 1, 2, ..., k
            $\chi^{*}_{2} = \chi_{2} - a^{*}_{12} \chi^{**}_{1}$ .



It is observed that
$a^{*}_{21} = a_{21} - a^{*}_{12} a^{**}_{11} = a_{21} - a_{12} = 0$
by symmetry of A. There is no point in entering this zero in the
table in practice.

The second row of double-starred elements is obtained by
dividing the preceding row by the element $a^{*}_{22}$, thus

(10.2.6)    $a^{**}_{2j} = a^{*}_{2j} / a^{*}_{22}$ ,    j = 2, ..., m
            $b^{**}_{2j} = b^{*}_{2j} / a^{*}_{22}$ ,    j = 1, 2, ..., k
            $\chi^{**}_{2} = \chi^{*}_{2} / a^{*}_{22}$ .

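The seventh row is the third row of A, followed by the third
row of B, followed by the "check" entry $\chi_3$ defined by

(10.2.7)    $\chi_3 = \sum_{j=1}^{m} a_{3j} + \sum_{j=1}^{k} b_{3j}$ .

The third starred row is obtained via the equations

(10.2.8)    $a^{*}_{3j} = a_{3j} - a^{*}_{23} a^{**}_{2j} - a^{*}_{13} a^{**}_{1j}$ ,    j = 3, ..., m
            $b^{*}_{3j} = b_{3j} - a^{*}_{23} b^{**}_{2j} - a^{*}_{13} b^{**}_{1j}$ ,    j = 1, 2, ..., k
            $\chi^{*}_{3} = \chi_{3} - a^{*}_{23} \chi^{**}_{2} - a^{*}_{13} \chi^{**}_{1}$ .
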



It can be shown that $a^{*}_{31} = a^{*}_{32} = 0$, so that in
practice $a_{31}$ and $a_{32}$ and any entry below these elements
are omitted from the table. Notice that $a^{*}_{13}$ and
$a^{*}_{23}$ are multipliers for all elements in the third block,
third starred row, so that it is a good idea to pencil a circle
round these two ($a^{*}_{13}$ and $a^{*}_{23}$) when working in
the third block.

The third row of double-starred elements is obtained by
dividing the preceding row by $a^{*}_{33}$ so that

(10.2.9)    $a^{**}_{3j} = a^{*}_{3j} / a^{*}_{33}$ ,    j = 3, 4, ..., m
            $b^{**}_{3j} = b^{*}_{3j} / a^{*}_{33}$ ,    j = 1, 2, ..., k
            $\chi^{**}_{3} = \chi^{*}_{3} / a^{*}_{33}$ .



In general the first row of the r-th block (the (3r-2)-th row of
the table) is the r-th row of A, followed by the r-th row of B,
followed by the "check" entry given by

            $\chi_r = \sum_{j=1}^{m} a_{rj} + \sum_{j=1}^{k} b_{rj}$

and in practice the elements $a_{r1}, a_{r2}, \ldots, a_{r,r-1}$
are not entered since the elements $\{a^{*}_{ri}\}_{i=1}^{r-1}$
turn out to be zero. The r-th starred entries are given by

(10.2.10)   $a^{*}_{rj} = a_{rj} - \sum_{i=1}^{r-1} a^{*}_{ir} a^{**}_{ij}$ ,    j = r, r+1, ..., m
            $b^{*}_{rj} = b_{rj} - \sum_{i=1}^{r-1} a^{*}_{ir} b^{**}_{ij}$ ,    j = 1, 2, ..., k
            $\chi^{*}_{r} = \chi_{r} - \sum_{i=1}^{r-1} a^{*}_{ir} \chi^{**}_{i}$ ;

again, when working in the r-th block it is a good idea to pencil
a circle round elements $a^{*}_{1r}, \ldots, a^{*}_{r-1,r}$ since
they occur in all the multiplications. The r-th double-starred
row is given by

(10.2.11)   $a^{**}_{rj} = a^{*}_{rj} / a^{*}_{rr}$ ,    j = r, r+1, ..., m
            $b^{**}_{rj} = b^{*}_{rj} / a^{*}_{rr}$ ,    j = 1, 2, ..., k
            $\chi^{**}_{r} = \chi^{*}_{r} / a^{*}_{rr}$ .



The process is repeated until all rows of A are exhausted
(producing then m blocks of three-line entries).

As a check on the computations, it is noted that

(10.2.12)   $\chi^{**}_{r} = \sum_{j=1}^{m} a^{**}_{rj} + \sum_{j=1}^{k} b^{**}_{rj}$
                         $= \sum_{j=r}^{m} a^{**}_{rj} + \sum_{j=1}^{k} b^{**}_{rj}$

            (since $\{a^{**}_{rj} = 0\}_{j=1}^{r-1}$).



This completes the so-called "forward" part of the Doolittle
process and it is instructive to consider what has been done
algebraically. It is easy to verify that in order to go from
$A = (a_{ij})$ to $A^{*} = (a^{*}_{ij})$, we have premultiplied A
by a lower triangular matrix with ones on the diagonal. Calling
this matrix F, we have

(10.2.13)   $F = \begin{pmatrix}
            1      & 0      & 0      & 0      & \cdots & 0 \\
            f_{21} & 1      & 0      & 0      & \cdots & 0 \\
            f_{31} & f_{32} & 1      & 0      & \cdots & 0 \\
            f_{41} & f_{42} & f_{43} & 1      & \cdots & 0 \\
            \vdots & \vdots & \vdots & \vdots &        & \vdots \\
            f_{m1} & f_{m2} & f_{m3} & f_{m4} & \cdots & 1
            \end{pmatrix}$

In fact, F is the product of matrices

(10.2.14)   $F = J_m J_{m-1} J_{m-2} \cdots J_3 J_2 J_1$

where $J_1 = I$ (the m×m identity matrix) and in general $J_r$ is
the identity matrix with the r-th row replaced by

            $(-a^{**}_{1r}, \; -a^{**}_{2r}, \; \ldots, \; -a^{**}_{r-1,r}, \; 1, \; 0, \; 0, \; \ldots, \; 0)$ .




Now exactly the same operations are carried out on B as those on
A; accordingly if we define $B^{*} = (b^{*}_{ij})$ then

(10.2.15)   $FA = A^{*}$
            $FB = B^{*}$

or          $F(A \,|\, B) = (A^{*} \,|\, B^{*})$ .

To produce the double-starred elements, we divide the r-th row
of $(A^{*} \,|\, B^{*})$ by $a^{*}_{rr}$. Define the diagonal
matrix whose r-th diagonal element is $a^{*}_{rr}$ by D, then

(10.2.16)   $D^{-1}A^{*} = D^{-1}FA = A^{**}$
            $D^{-1}B^{*} = D^{-1}FB = B^{**}$ .



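10.3 The solution of $A\Phi = B$.

We are now in a position to determine $\Phi$ by simple
arithmetic operations. Since

(10.3.1)    $A\Phi = B$ ,

then if F be the forward Doolittle process on A carrying B, then

(10.3.2)    $D^{-1}FA\Phi = D^{-1}FB$

or

(10.3.3)    $A^{**}\Phi = B^{**}$ .

The solution for $\Phi$ is simple by virtue of the fact that
$A^{**}$ is

(10.3.4)    $A^{**} = \begin{pmatrix}
            1 & a^{**}_{12} & a^{**}_{13} & \cdots & a^{**}_{1,m-1} & a^{**}_{1m} \\
            0 & 1           & a^{**}_{23} & \cdots & a^{**}_{2,m-1} & a^{**}_{2m} \\
            \vdots &        &             &        &                & \vdots \\
            0 & 0           & 0           & \cdots & 1              & a^{**}_{m-1,m} \\
            0 & 0           & 0           & \cdots & 0              & 1
            \end{pmatrix}$ .

Writing equation (10.3.3) in full:

            $A^{**} \begin{pmatrix}
            \phi_{11}    & \phi_{12}    & \cdots & \phi_{1k} \\
            \phi_{21}    & \phi_{22}    & \cdots & \phi_{2k} \\
            \vdots       &              &        & \vdots \\
            \phi_{m-1,1} & \phi_{m-1,2} & \cdots & \phi_{m-1,k} \\
            \phi_{m1}    & \phi_{m2}    & \cdots & \phi_{mk}
            \end{pmatrix}
            = \begin{pmatrix}
            b^{**}_{11}    & b^{**}_{12}    & \cdots & b^{**}_{1k} \\
            b^{**}_{21}    & b^{**}_{22}    & \cdots & b^{**}_{2k} \\
            \vdots         &                &        & \vdots \\
            b^{**}_{m-1,1} & b^{**}_{m-1,2} & \cdots & b^{**}_{m-1,k} \\
            b^{**}_{m1}    & b^{**}_{m2}    & \cdots & b^{**}_{mk}
            \end{pmatrix}$ .

Multiplying the last row of $A^{**}$ into $\Phi$ we have

(10.3.5)    $(\phi_{m1} \;\; \phi_{m2} \;\; \cdots \;\; \phi_{mk}) = (b^{**}_{m1} \;\; b^{**}_{m2} \;\; \cdots \;\; b^{**}_{mk})$ .

Multiplying the penultimate row of $A^{**}$ into $\Phi$ we have

(10.3.6)    $(\phi_{m-1,1} \;\; \phi_{m-1,2} \;\; \cdots \;\; \phi_{m-1,k}) + a^{**}_{m-1,m} (\phi_{m1} \;\; \phi_{m2} \;\; \cdots \;\; \phi_{mk})$
            $= (b^{**}_{m-1,1} \;\; b^{**}_{m-1,2} \;\; \cdots \;\; b^{**}_{m-1,k})$

so that $\{\phi_{m-1,j}\}_{j=1}^{k}$ are quickly determined.
Similarly for all rows of $\Phi$, ending with the first row of
$\Phi$. An example is given in section 10.5.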
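
The forward process of section 10.2 and the back solution just
described can be sketched in code. This is only an illustration,
not part of the original notes: the function names
`forward_doolittle` and `back_solve` and the small 2×2 system are
assumptions made for the example, and exact fractions stand in for
the desk calculator.

```python
from fractions import Fraction

def forward_doolittle(A, B):
    """Forward Doolittle on symmetric A (m x m) carrying B (m x k).

    Returns the starred and double-starred arrays (A*, B*, A**, B**).
    """
    m, k = len(A), len(B[0])
    a_s, b_s, a_ss, b_ss = [], [], [], []
    for r in range(m):
        # r-th starred row: subtract a*_{ir} times the i-th
        # double-starred row for every earlier block i < r
        a_row = [Fraction(A[r][j]) - sum(a_s[i][r] * a_ss[i][j] for i in range(r))
                 for j in range(m)]
        b_row = [Fraction(B[r][j]) - sum(a_s[i][r] * b_ss[i][j] for i in range(r))
                 for j in range(k)]
        pivot = a_row[r]  # the leading element a*_{rr}
        a_s.append(a_row)
        b_s.append(b_row)
        a_ss.append([v / pivot for v in a_row])
        b_ss.append([v / pivot for v in b_row])
    return a_s, b_s, a_ss, b_ss

def back_solve(a_ss, b_ss):
    """Solve A** Phi = B** row by row, starting from the last row."""
    m, k = len(a_ss), len(b_ss[0])
    phi = [[Fraction(0)] * k for _ in range(m)]
    for r in reversed(range(m)):
        for j in range(k):
            phi[r][j] = b_ss[r][j] - sum(a_ss[r][i] * phi[i][j]
                                         for i in range(r + 1, m))
    return phi

# Hypothetical system: 4x + 2y = 2, 2x + 3y = 1
A = [[4, 2], [2, 3]]
B = [[2], [1]]
_, _, a_ss, b_ss = forward_doolittle(A, B)
phi = back_solve(a_ss, b_ss)
print(phi)
```

The check column is omitted here for brevity; in hand computation
it would be carried along as one more column of B.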



10.4 The determinant and inverse of A.

Returning to equation (10.2.15), we have, after taking
determinants of left- and right-hand sides,

(10.4.1)    $|F| \, |A| = |A^{*}|$

but $|F| = 1$ and $|A^{*}| = a^{*}_{11} a^{*}_{22} \cdots a^{*}_{mm}$,
therefore

(10.4.2)    $|A| = a^{*}_{11} a^{*}_{22} \cdots a^{*}_{mm}$ .


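It is important also to observe that if $FA = A^{*}$, then

            $\begin{pmatrix}
            1      & 0      & 0      & \cdots & 0 \\
            f_{21} & 1      & 0      & \cdots & 0 \\
            f_{31} & f_{32} & 1      & \cdots & 0 \\
            \vdots &        &        &        & \vdots \\
            f_{r1} & f_{r2} & f_{r3} & \cdots & 1
            \end{pmatrix}
            \begin{pmatrix}
            a_{11} & a_{12} & a_{13} & \cdots & a_{1r} \\
            a_{21} & a_{22} & a_{23} & \cdots & a_{2r} \\
            a_{31} & a_{32} & a_{33} & \cdots & a_{3r} \\
            \vdots &        &        &        & \vdots \\
            a_{r1} & a_{r2} & a_{r3} & \cdots & a_{rr}
            \end{pmatrix}
            = \begin{pmatrix}
            a^{*}_{11} & a^{*}_{12} & a^{*}_{13} & \cdots & a^{*}_{1r} \\
            0          & a^{*}_{22} & a^{*}_{23} & \cdots & a^{*}_{2r} \\
            0          & 0          & a^{*}_{33} & \cdots & a^{*}_{3r} \\
            \vdots     &            &            &        & \vdots \\
            0          & 0          & 0          & \cdots & a^{*}_{rr}
            \end{pmatrix}$

so that

(10.4.3)    $\begin{vmatrix}
            a_{11} & a_{12} & \cdots & a_{1r} \\
            a_{21} & a_{22} & \cdots & a_{2r} \\
            \vdots &        &        & \vdots \\
            a_{r1} & a_{r2} & \cdots & a_{rr}
            \end{vmatrix} = a^{*}_{11} a^{*}_{22} \cdots a^{*}_{rr}$ .
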
This immediately yields another important result. Let A be
partitioned:

(10.4.4)    $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$

with $A_{11}$ and $A_{22}$ p×p and q×q respectively, where
p+q = m. Then (by clockwise rule)

(10.4.5)    $|A| = |A_{11}| \, |A_{22} - A_{21} A_{11}^{-1} A_{12}|$ .

Using (10.4.2) and (10.4.3) with r = p,

(10.4.6)    $|A_{22} - A_{21} A_{11}^{-1} A_{12}| = a^{*}_{p+1,p+1} \, a^{*}_{p+2,p+2} \cdots a^{*}_{mm}$ .



Turning now to the inverse of A, we carry I (m×m) rather than
general B. In our notation, we have:

(10.4.7)    $F(A \,|\, I) = (A^{*} \,|\, I^{*})$
            $D^{-1}(A^{*} \,|\, I^{*}) = (A^{**} \,|\, I^{**})$ .

Notice that

(10.4.8)    $I^{*} = F$
            $I^{**} = D^{-1}F$ .

Now, postmultiplying the first of equations (10.2.15) by $F'$,
we have

(10.4.9)    $FAF' = A^{*}F'$ .

Since $A^{*}$ and $F'$ are both upper triangular, then so also is
their product; but $FAF'$ ($= A^{*}F'$) is symmetric. Since
$A^{*}F'$ is upper triangular and symmetric, it must be diagonal.
Clearly $(A^{*}F')_{ii} = a^{*}_{ii}$, therefore

(10.4.10)   $FAF' = A^{*}F' = D$ ,

so that

(10.4.11)   $A^{-1} = F'D^{-1}F = (I^{*})'(I^{**})$ ,

so that

(10.4.12)   $(A^{-1})_{\alpha\beta} = ((I^{*})_{\cdot\alpha})' \, ((I^{**})_{\cdot\beta})$ ,

that is, the $(\alpha, \beta)$-th element of $A^{-1}$ is the inner
product of the $\alpha$-th column of $I^{*}$ with the $\beta$-th
column of $I^{**}$. An example is given in the next section.

10.5 Examples of the problems discussed in section 1.

Problem I

Let it be required to solve for $\Phi$ from the equation
$A\Phi = B$ when

Having performed the forward Doolittle process, the equation
$A\Phi = B$ becomes $A^{**}\Phi = B^{**}$ where

A** =
    1.00000000  1.47368421  1.94736842  1.31578947  0.00000000
    zero        1.00000000  0.00000000  1.31578947  0.00000000
    zero        zero        1.00000000  1.00000000  1.00000000
    zero        zero        zero        1.00000000  1.00000000
    zero        zero        zero        zero        1.00000000

B** =
    9.47368421  6.89473684
    2.63157895  3.94736842
    7.00000000  6.00000000
    4.00000000  5.00000000
    2.00000000  2.00000000


We have immediately

$\phi_{51}$ = 2.00000000
$\phi_{52}$ = 2.00000000
$\phi_{41}$ = 4.00000000 - (1.00000000)(2.00000000) = 2.00000000
$\phi_{42}$ = 5.00000000 - (1.00000000)(2.00000000) = 3.00000000
$\phi_{31}$ = 7.00000000 - (1.00000000)(2.00000000)
              - (1.00000000)(2.00000000) = 3.00000000
$\phi_{32}$ = 6.00000000 - (1.00000000)(3.00000000)
              - (1.00000000)(2.00000000) = 1.00000000

and similarly

$\phi_{21}$ = 0.00000000
$\phi_{22}$ = 0.00000000
$\phi_{11}$ = 1.00000000
$\phi_{12}$ = 1.00000000
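
The back solution of Problem I can be checked mechanically. The
sketch below copies the arrays $A^{**}$ and $B^{**}$ as printed and
works upwards from the last row; the function name `back_solve` is
an assumption of the example, and the answers are rounded to six
places because the printed entries are themselves rounded to eight.

```python
def back_solve(a_ss, b_ss):
    """Solve A** Phi = B** from the last row upwards."""
    m, k = len(a_ss), len(b_ss[0])
    phi = [[0.0] * k for _ in range(m)]
    for r in reversed(range(m)):
        for j in range(k):
            phi[r][j] = b_ss[r][j] - sum(a_ss[r][i] * phi[i][j]
                                         for i in range(r + 1, m))
    return phi

# Double-starred arrays as printed for Problem I
A_SS = [
    [1.0, 1.47368421, 1.94736842, 1.31578947, 0.0],
    [0.0, 1.0, 0.0, 1.31578947, 0.0],
    [0.0, 0.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
B_SS = [
    [9.47368421, 6.89473684],
    [2.63157895, 3.94736842],
    [7.0, 6.0],
    [4.0, 5.0],
    [2.0, 2.0],
]

phi = [[round(v, 6) for v in row] for row in back_solve(A_SS, B_SS)]
for row in phi:
    print(row)
```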

Problem II

It is required to evaluate the determinant of matrix A
when A is

      1     2     3     4     5
      2     5     4     4    10
      3     4    15    24    17
      4     4    24    45    44
      5    10    17    44   110


(this matrix has been chosen so that the forward Doolittle
can be performed without the need of a desk calculator).

The computations proceed:

and

$|A| = \prod_{i=1}^{5} a^{*}_{ii} = 1 \times 1 \times 2 \times 5 \times 3 = 30$ .
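
As a rough check on Problem II, the leading starred elements can be
generated by a few lines of code and multiplied together as in
(10.4.2). The helper name `doolittle_pivots` is hypothetical, and
exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def doolittle_pivots(A):
    """Return the leading starred elements a*_{11}, ..., a*_{mm} of symmetric A."""
    m = len(A)
    starred, double_starred = [], []
    for r in range(m):
        # r-th starred row, as in equation (10.2.10)
        row = [Fraction(A[r][j]) - sum(starred[i][r] * double_starred[i][j]
                                       for i in range(r))
               for j in range(m)]
        starred.append(row)
        double_starred.append([v / row[r] for v in row])
    return [starred[r][r] for r in range(m)]

A = [
    [1, 2, 3, 4, 5],
    [2, 5, 4, 4, 10],
    [3, 4, 15, 24, 17],
    [4, 4, 24, 45, 44],
    [5, 10, 17, 44, 110],
]

pivots = doolittle_pivots(A)
det = 1
for p in pivots:
    det *= p          # |A| is the product of the a*_{ii}
print(pivots, det)
```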



Problem III

It is required to obtain the inverse of A, symmetric, when

A =
      1     1     1     0     0
      1     5    -1     2     3
      1    -1     3     0     0
      0     2     0     3     1
      0     3     0     1    10

(Here again, for demonstration purposes, A has been chosen
so that the Doolittle procedure does not require the use
of a desk calculator).

The required table develops as follows:

            A                  |            I                  |  χ
   1    1     1     0     0    |   1     0     0     0     0   |  4
   1    1     1     0     0    |   1     0     0     0     0   |  4
   1    1     1     0     0    |   1     0     0     0     0   |  4

        5    -1     2     3    |   0     1     0     0     0   |  11
        4    -2     2     3    |  -1     1     0     0     0   |  7
        1   -1/2   1/2   3/4   | -1/4   1/4    0     0     0   |  7/4

              3     0     0    |   0     0     1     0     0   |  4
              1     1    3/2   | -3/2   1/2    1     0     0   |  7/2
              1     1    3/2   | -3/2   1/2    1     0     0   |  7/2

                    3     1    |   0     0     0     1     0   |  7
                    1    -2    |   2    -1    -1     1     0   |  0
                    1    -2    |   2    -1    -1     1     0   |  0

                         10    |   0     0     0     0     1   |  15
                         3/2   |   7   -7/2  -7/2    2     1   |  9/2
                          1    | 14/3  -7/3  -7/3   4/3   2/3  |  3



and the values of $(A^{-1})_{\alpha\beta}$ are obtained (equation
10.4.12) directly from the entries under "I"; thus

$(A^{-1})_{11} = (1)(1) + (-1)(-\tfrac{1}{4}) + (-\tfrac{3}{2})(-\tfrac{3}{2}) + (2)(2) + (7)(\tfrac{14}{3}) = \tfrac{482}{12}$

$(A^{-1})_{12} = (1)(0) + (-1)(\tfrac{1}{4}) + (-\tfrac{3}{2})(\tfrac{1}{2}) + (2)(-1) + (7)(-\tfrac{7}{3}) = -\tfrac{232}{12}$

etcetera

$A^{-1} = \tfrac{1}{6}$ ×
     241  -116  -119    68    28
    -116    58    58   -34   -14
    -119    58    61   -34   -14
      68   -34   -34    22     8
      28   -14   -14     8     4

It is advisable as a final check to perform the product $AA^{-1}$
to make sure the identity matrix does indeed result.
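
The inversion of Problem III, including the final check
$AA^{-1} = I$, can be sketched as follows. The function name
`doolittle_inverse` is an assumption; the entries of A are those
heading each block of the worked table (first element 1, last
diagonal element 10), and exact fractions replace the hand
arithmetic.

```python
from fractions import Fraction

def doolittle_inverse(A):
    """Invert symmetric A by a forward Doolittle pass carrying I."""
    m = len(A)
    a_s, a_ss, i_s, i_ss = [], [], [], []
    for r in range(m):
        a_row = [Fraction(A[r][j]) - sum(a_s[i][r] * a_ss[i][j] for i in range(r))
                 for j in range(m)]
        i_row = [Fraction(int(r == j)) - sum(a_s[i][r] * i_ss[i][j] for i in range(r))
                 for j in range(m)]
        a_s.append(a_row)
        i_s.append(i_row)
        a_ss.append([v / a_row[r] for v in a_row])
        i_ss.append([v / a_row[r] for v in i_row])
    # (alpha, beta) element: column alpha of I* times column beta of I**
    return [[sum(i_s[r][a] * i_ss[r][b] for r in range(m)) for b in range(m)]
            for a in range(m)]

A = [
    [1, 1, 1, 0, 0],
    [1, 5, -1, 2, 3],
    [1, -1, 3, 0, 0],
    [0, 2, 0, 3, 1],
    [0, 3, 0, 1, 10],
]

inv = doolittle_inverse(A)
six_times_inv = [[6 * v for v in row] for row in inv]
# Final check: A times the computed inverse should be the identity
product = [[sum(A[i][h] * inv[h][j] for h in range(5)) for j in range(5)]
           for i in range(5)]
for row in six_times_inv:
    print([int(v) for v in row])
```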

10.6 The computation of $B'A^{-1}B$ (A and B given).

Perform the forward Doolittle on A carrying B:

(10.6.1)    $F(A \,|\, B) = (A^{*} \,|\, B^{*})$
            $D^{-1}(A^{*} \,|\, B^{*}) = (A^{**} \,|\, B^{**})$ .

Now since $A^{-1} = F'D^{-1}F$ (equation 10.4.11),

(10.6.2)    $B'A^{-1}B = B'F'D^{-1}FB$
                      $= (FB)'D^{-1}FB = (B^{*})'(B^{**})$

so that

(10.6.4)    $(B'A^{-1}B)_{\alpha\beta} = ((B^{*})_{\cdot\alpha})' \, ((B^{**})_{\cdot\beta})$ ,

that is, the $(\alpha, \beta)$-th element of $B'A^{-1}B$ is the
inner product of the $\alpha$-th column of $B^{*}$ with the
$\beta$-th column of $B^{**}$.



Problem IV

Obtain the value of the quadratic form $x'A^{-1}x$ when

A =
      1     2     3     3     3
      2     9     6     9     0
      3     6    14    10    10
      3     9    10    12    12
      3     0    10    12    18

x' = (1  0  4  2  3).

The required table develops as follows:

The value of $x'A^{-1}x$ is, from the entries under x,

(1)(1) + (-2)(-0.4) + (1)(0.2) + (0)(0) + (-2.6)(260/3936)
    = 2 - 676/3936 = 1.828252    ($= x'A^{-1}x$).
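
The quadratic form of Problem IV follows from (10.6.2) with the
single column x in place of B. The sketch below is an illustration
only; the function name `quadratic_form` is an assumption, and the
value is accumulated as the sum of products of starred and
double-starred entries under x.

```python
from fractions import Fraction

def quadratic_form(A, x):
    """Return x' A^{-1} x by a forward Doolittle pass on A carrying x."""
    m = len(A)
    a_s, a_ss, x_s, x_ss = [], [], [], []
    for r in range(m):
        a_row = [Fraction(A[r][j]) - sum(a_s[i][r] * a_ss[i][j] for i in range(r))
                 for j in range(m)]
        x_r = Fraction(x[r]) - sum(a_s[i][r] * x_ss[i] for i in range(r))
        a_s.append(a_row)
        a_ss.append([v / a_row[r] for v in a_row])
        x_s.append(x_r)                 # starred entry under x
        x_ss.append(x_r / a_row[r])     # double-starred entry under x
    # inner product of the starred and double-starred x columns
    return sum(u * v for u, v in zip(x_s, x_ss))

A = [
    [1, 2, 3, 3, 3],
    [2, 9, 6, 9, 0],
    [3, 6, 14, 10, 10],
    [3, 9, 10, 12, 12],
    [3, 0, 10, 12, 18],
]
x = [1, 0, 4, 2, 3]

q = quadratic_form(A, x)
print(q, round(float(q), 6))
```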
CHAPTER 11

TESTS OF SIMPLE LINEAR HYPOTHESES

11.1 Introduction

In this chapter we deal with hypotheses which in the univariate
case would lead to a use of the Student t-distribution. In its
most general statement we have, say, g groups or populations and
$n_i$ vector observations from the i-th group. Our observations
will be designated $x_{ij}$ (j = 1, 2, ..., $n_i$; i = 1, 2, ..., g)
and it will be assumed

(11.1.1)    $x_{ij} \sim N_m(\mu_i; V)$ ,

the $N = \sum_{i=1}^{g} n_i$ vectors $\{x_{ij}\}$ being mutually
independent. The most general simple hypothesis is:

(11.1.2)    $H_0: \beta_1\mu_1 + \beta_2\mu_2 + \cdots + \beta_g\mu_g = \mu_0$

where $\{\beta_i\}_{1}^{g}$ are a set of specified scalar constants
and where $\mu_0$ is a specified m×1 vector of constants (often the
vector of zeros).

The hypothesis expressed in (11.1.2) includes the following
more familiar cases:

(a) (one group)   $\mu_1 = \mu_0$   (g=1; $\beta_1$=1)

(b) (two groups)  $\mu_1 = \mu_2$   (g=2; $\beta_1$=1; $\beta_2$=-1; $\mu_0$=0)

and actually includes the regression model but this will be
treated separately.
11.2 The test statistic for the general simple linear hypothesis.

Given the observations described in the introduction, that
is, given

(11.2.1)    $x_{ij} \sim N_m(\mu_i; V)$ ,    j = 1, 2, ..., $n_i$ ;  i = 1, 2, ..., g

it is required to test

(11.2.2)    $H_0: \sum_{i=1}^{g} \beta_i\mu_i = \mu_0$

for specified $\{\beta_i\}$ and $\mu_0$.

Method:

Obtain the sample product-cross product matrix for the i-th
group

(11.2.3)    $C_i = \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{i\cdot})(x_{ij} - \bar{x}_{i\cdot})'$
                $= \sum_{j=1}^{n_i} x_{ij}x_{ij}' - n_i \bar{x}_{i\cdot}\bar{x}_{i\cdot}'$

where

            $\bar{x}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij}$

and construct

(11.2.4)    $C = C_1 + C_2 + \cdots + C_g$

so that C is an unbiased estimate of (N-g)V.

Clearly

(11.2.5)    $d = \left( \sum_{i=1}^{g} \beta_i \bar{x}_{i\cdot} \right) - \mu_0$ ,

which has expectation $\sum \beta_i\mu_i - \mu_0$, is a measure of
departure from $H_0$, having expectation $0$ if $H_0$ be true. The
statistic d is actually a vector random variable having the
distribution

(11.2.6)    $d \sim N_m\left( \sum \beta_i\mu_i - \mu_0 \; ; \; \left( \sum \frac{\beta_i^2}{n_i} \right) V \right)$

so that "standardization to V" is effected by dividing d by
$\left( \sum \beta_i^2 / n_i \right)^{1/2}$. The test criterion
becomes

(11.2.7)    $T^2 = \left( \sum (\beta_i^2 / n_i) \right)^{-1} d'C^{-1}d$

and the critical region of size $\alpha$ is given by

(11.2.8)    $T^2 > \frac{m}{N-g+1-m} \, F^{(\alpha)}_{m:N-g+1-m}$

where $F^{(\alpha)}_{m:N-g+1-m}$ is the point of the
$F_{m:N-g+1-m}$ distribution which cuts off $100\alpha\%$ above.
$d'C^{-1}d$ is computed using the Doolittle procedure (see
Chapter 10).



11.3 Simple test on the mean of a single population

In the development of Section 11.2, we replace g by 1,
set $\beta_1 = 1$, and observe

(11.3.1)    $x_j \sim N_m(\mu; V)$ ,    j = 1, 2, ..., n.

It is required to test $H_0: \mu = \mu_0$. The test criterion
becomes

(11.3.2)    $T^2 = n (\bar{x} - \mu_0)' C^{-1} (\bar{x} - \mu_0)$

where

(11.3.3)    $C = \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' = \sum_{j=1}^{n} x_j x_j' - n\bar{x}\bar{x}'$ .

$(\bar{x} - \mu_0)' C^{-1} (\bar{x} - \mu_0)$ is computed using the
Doolittle procedure (see Chapter 10). The critical region of size
$\alpha$ is given by

(11.3.4)    $T^2 > \frac{m}{n-m} \, F^{(\alpha)}_{m:n-m}$ .

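11.4 A comparison of the means of two populations.

In the development of Section 11.2, we replace g by 2 and
set $\beta_1 = 1$; $\beta_2 = -1$; $\mu_0 = 0$ and observe

(11.4.1)    $x_{ij} \sim N_m(\mu_i; V)$ ,    j = 1, 2, ..., $n_i$ ;  i = 1, 2.

It is required to test $H_0: \mu_1 = \mu_2$. The standardizing
factor is $\frac{1}{n_1} + \frac{1}{n_2} = \frac{n_1 + n_2}{n_1 n_2}$
so that the test criterion becomes

(11.4.2)    $T^2 = \frac{n_1 n_2}{n_1 + n_2} (\bar{x}_{1\cdot} - \bar{x}_{2\cdot})' C^{-1} (\bar{x}_{1\cdot} - \bar{x}_{2\cdot})$

where

(11.4.3)    $C = \sum_{j=1}^{n_1} (x_{1j} - \bar{x}_{1\cdot})(x_{1j} - \bar{x}_{1\cdot})' + \sum_{j=1}^{n_2} (x_{2j} - \bar{x}_{2\cdot})(x_{2j} - \bar{x}_{2\cdot})'$
              $= \left( \sum_{j=1}^{n_1} x_{1j}x_{1j}' - n_1\bar{x}_{1\cdot}\bar{x}_{1\cdot}' \right) + \left( \sum_{j=1}^{n_2} x_{2j}x_{2j}' - n_2\bar{x}_{2\cdot}\bar{x}_{2\cdot}' \right)$ .

$(\bar{x}_{1\cdot} - \bar{x}_{2\cdot})' C^{-1} (\bar{x}_{1\cdot} - \bar{x}_{2\cdot})$
is computed using the Doolittle procedure (see Chapter 10).
The critical region of size $\alpha$ is given by

(11.4.4)    $T^2 > \frac{m}{n_1 + n_2 - 1 - m} \, F^{(\alpha)}_{m:n_1+n_2-1-m}$ .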

11.5 An example: Testing a relationship between elements within
the mean vector.

A task is performed on each of five successive days by 12
individuals (these individuals forming a fairly homogeneous group).
It is noticeable that the "time to complete the task" decreases
with each day due, most likely, to the experience gained. It is
required to test the hypothesis that the time to complete the
task is decreasing at a constant rate for each individual (the
rate almost certainly differs for different individuals). The
data are recorded below:

Time To Complete Task (Minutes)

Individual   Day 1   Day 2   Day 3   Day 4   Day 5
     1        10.6     9.7     8.4     7.4     6.3
     2         8.3     8.1     8.2     8.3     7.6
     3         8.5     7.7     7.5     6.9     6.4
     4         9.0     8.9     8.7     8.1     7.6
     5        13.3    12.6    11.5    10.5     9.3
     6        17.0    14.8    12.3    10.3     7.8
     7         8.0     8.2     7.8     7.4     7.4
     8        12.1    12.0    11.4    10.6    10.5
     9        15.0    13.8    12.7    11.9    10.6
    10        13.8    12.1    10.9     9.0     7.0
    11        13.7    13.3    12.4    11.7    10.9
    12        11.3    10.6     9.1     8.0     6.7




The data collected is $y_t$ say, t = 1, 2, ..., 12, with $y_t$
the five entry vector for individual t. The mean vector $\nu_t$
should, under $H_0$, satisfy

$H_0: \nu_{1t} - \nu_{2t} = \nu_{2t} - \nu_{3t} = \nu_{3t} - \nu_{4t} = \nu_{4t} - \nu_{5t}$    (all t)

or, alternatively

            $\nu_{1t} - 2\nu_{2t} + \nu_{3t} = 0$
            $\nu_{2t} - 2\nu_{3t} + \nu_{4t} = 0$        (all t)
            $\nu_{3t} - 2\nu_{4t} + \nu_{5t} = 0$

Accordingly, to test $H_0$, we construct

            $x_t = \begin{pmatrix} x_{1t} \\ x_{2t} \\ x_{3t} \end{pmatrix}
                 = \begin{pmatrix} y_{1t} - 2y_{2t} + y_{3t} \\ y_{2t} - 2y_{3t} + y_{4t} \\ y_{3t} - 2y_{4t} + y_{5t} \end{pmatrix}$

and if $H_0$ be true, then $E(x_t) = 0$. The vectors $x_t$ are

$x_1'$    = (-0.4  +0.3  -0.1)
$x_2'$    = (+0.3  +0.0  -0.8)
$x_3'$    = (+0.6  -0.4  +0.1)
$x_4'$    = (-0.1  -0.4  +0.1)
$x_5'$    = (-0.4  +0.1  -0.2)
$x_6'$    = (-0.3  +0.5  -0.5)
$x_7'$    = (-0.6  +0.0  +0.4)
$x_8'$    = (-0.5  -0.2  +0.7)
$x_9'$    = (+0.1  +0.3  -0.5)
$x_{10}'$ = (+0.5  -0.7  -0.1)
$x_{11}'$ = (-0.5  +0.2  -0.1)
$x_{12}'$ = (-0.8  +0.4  -0.2)

thus

            $12(\bar{x} - \mu_0)' = (-2.1 \;\; +0.1 \;\; -1.2)$

and

            $C = \sum_{t=1}^{12} x_t x_t' - 12\bar{x}\bar{x}'$

              $= \begin{pmatrix} +2.63 & -1.15 & -0.40 \\ -1.15 & +1.49 & -0.70 \\ -0.40 & -0.70 & +1.92 \end{pmatrix}
                 - \frac{1}{12} \begin{pmatrix} -2.1 \\ +0.1 \\ -1.2 \end{pmatrix} (-2.1 \;\; +0.1 \;\; -1.2)$

              $= \frac{1}{12} \begin{pmatrix} +27.15 & -13.59 & -7.32 \\ -13.59 & +17.87 & -8.28 \\ -7.32 & -8.28 & +21.60 \end{pmatrix}$ .

The test criterion (see 11.3.2) is then

            $T^2 = 12(\bar{x} - \mu_0)' C^{-1} (\bar{x} - \mu_0)$

                $= (-2.1 \;\; +0.1 \;\; -1.2)
                   \begin{pmatrix} +27.15 & -13.59 & -7.32 \\ -13.59 & +17.87 & -8.28 \\ -7.32 & -8.28 & +21.60 \end{pmatrix}^{-1}
                   \begin{pmatrix} -2.1 \\ +0.1 \\ -1.2 \end{pmatrix}$

(note that the factors of 12 and 1/12 cancel so that it is
unnecessary, and inadvisable computationally, to divide by 12
where indicated to obtain C).

The quantity $T^2$ is obtained by a Doolittle process as
indicated below

Doolittle Procedure

                 12C                          |  12(x̄ - μ_0)  |  Check
  +27.15       -13.59        -7.32            |  -2.1         |  +4.14
  +27.15       -13.59        -7.32            |  -2.1         |  +4.14
   1.          -0.50055248   -0.26961325      |  -0.07734806  |  +0.15248618

               +17.87        -8.28            |  +0.1         |  -3.90
               +11.06749180  -11.94404407     |  -0.95116013  |  -1.82771281
                1.           -1.07920062      |  -0.08594179  |  -0.16514245

                             +21.60           |  -1.2         |  +4.8
                             +6.73641124      |  -2.79268033  |  +3.94373014
                              1.              |  -0.41456500  |  +0.58543488

$T^2$ = (-2.1)(-0.07734806) + (-0.95116013)(-0.08594179)
        + (-2.79268033)(-0.41456500) = 1.3978.

Critical regions for $T^2$:

size 0.001  :  $T^2 > 4.63$
size 0.005  :  $T^2 > 2.91$
size 0.01   :  $T^2 > 2.33$
size 0.025  :  $T^2 > 1.69$
size 0.05   :  $T^2 > 1.29$

The hypothesis of a linearly decreasing time to complete task
over the five days would be rejected at a 5% level.
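
The whole computation of this example, from the recorded times to
the test criterion, can be sketched in a few lines. The sketch is
illustrative only: variable names are assumptions, times are held
in tenths of a minute so that the arithmetic is exact, and the
criterion is formed as the sum of products of starred and
double-starred entries in the carried column. Carried at full
precision, the same arithmetic gives a value near 1.40, marginally
above the hand-computed figure quoted above; it still lies between
the 5% and 2.5% critical values, so the conclusion is unchanged.

```python
from fractions import Fraction

# Times in tenths of a minute (12 individuals x 5 days), from the data table
TIMES = [
    [106, 97, 84, 74, 63], [83, 81, 82, 83, 76], [85, 77, 75, 69, 64],
    [90, 89, 87, 81, 76], [133, 126, 115, 105, 93], [170, 148, 123, 103, 78],
    [80, 82, 78, 74, 74], [121, 120, 114, 106, 105], [150, 138, 127, 119, 106],
    [138, 121, 109, 90, 70], [137, 133, 124, 117, 109], [113, 106, 91, 80, 67],
]
n = len(TIMES)

# Second differences x_t = (y1 - 2y2 + y3, y2 - 2y3 + y4, y3 - 2y4 + y5)
X = [[y[i] - 2 * y[i + 1] + y[i + 2] for i in range(3)] for y in TIMES]
s = [sum(x[i] for x in X) for i in range(3)]            # 10 x (12 * xbar)
M = [[n * sum(x[i] * x[j] for x in X) - s[i] * s[j]     # 100 x (12 * C)
      for j in range(3)] for i in range(3)]

# Forward Doolittle on M carrying s; the scale factors cancel in s' M^{-1} s
starred, dstarred, s_star, s_dstar = [], [], [], []
for r in range(3):
    row = [Fraction(M[r][j]) - sum(starred[i][r] * dstarred[i][j] for i in range(r))
           for j in range(3)]
    s_r = Fraction(s[r]) - sum(starred[i][r] * s_dstar[i] for i in range(r))
    starred.append(row)
    dstarred.append([v / row[r] for v in row])
    s_star.append(s_r)
    s_dstar.append(s_r / row[r])

t2 = sum(u * v for u, v in zip(s_star, s_dstar))
print(s, M, round(float(t2), 4))
```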

It is interesting to observe that had the test been performed
on any consecutive three days only (which would lead to the
following $t^2$-test criteria

Days 1, 2, 3:   $\frac{12(-2.1/12)^2}{27.15/(12 \times 11)} = 1.787$ ,   refer to $F_{1:11}$

Days 2, 3, 4:   $\frac{12(+0.1/12)^2}{17.87/(12 \times 11)} = 0.006$ ,   refer to $F_{1:11}$

Days 3, 4, 5:   $\frac{12(-1.2/12)^2}{21.60/(12 \times 11)} = 0.733$ ,   refer to $F_{1:11}$ )

none of the results would show as significant at the 5% level.
The three ratios are of course correlated so that it would be
difficult to give a significance level to any one in the presence
of the other two; nevertheless, the value $F^{(0.05)}_{1:11}$
is 4.84 which greatly exceeds any of the three values.

Considering the significance of $T^2$ against the
non-significance of the three $F_{1:11}$ in the context of the
data, we are not too surprised. A better straight line fit
to three days is inevitably available as against a fit to five
days. Considered out of context though, the data makes an
interesting point:

Suppose the 12 vector observations $\{x_t\}_{1}^{12}$ are
available from the normal trivariate density; it is required
to test ($H_0$) that the mean vector of the normal density is
a vector of zeros. The value of the test criterion is 1.3978
which is significant at between the 2.5% and 5% level so that
$H_0$ is rejected. Three separate tests performed on the
component elements of $\{x_t\}_{1}^{12}$, however, prove
non-significant individually.

This kind of situation is illustrated diagrammatically in the
next section.



11.6 A diagrammatic illustration of the comparison of two
population means in the bivariate case.

The following diagram is a dot diagram of two bivariate
populations

x: represents a member of population 1
o: represents a member of population 2.

[Figure: dot diagram of the two bivariate populations, with the line OL.]

A glance at the bivariate diagram is sufficient to indicate
that the two populations are different in respect of their
mean vector. All (or nearly all) the x-points lie inside an
ellipse essentially above the line OL; nearly all the o-points
lie inside an ellipse essentially below the line OL. Even
without resort to multivariate techniques and an appropriate
test, we would be prepared to pronounce the two populations
different. If we tried to assess the populations in respect
of one measurement alone ($x_1$, say), then we would be looking
at the projection of all points (x's and o's) onto the abscissa
of the diagram. It is noted that there is considerable
intermingling of the projected points and one would be hard put to
decide whether these points came from a population with the
same $E(x_1)$ or not. Similar remarks apply to the projection
onto the ordinate of the diagram when the values of $E(x_2)$
are being compared. It is clear then that the bivariate
comparison is much more decisive than two univariate comparisons.
Of course, the points have been chosen to illustrate the
advantage of a bivariate test (in general, a multivariate test) as
dramatically as possible; nevertheless, in cases where $\mu_1$ and
$\mu_2$, the vector means for two m-variate populations, are such
that corresponding elements of $\mu_1$ and $\mu_2$ differ by only
small quantities, then it may well happen that each of the m
univariate tests would individually prove non-significant (using
the standard t-test), whereas a multivariate test (Hotelling's)
would give a significant result.
CHAPTER 12

ANALYSIS OF DESIGN

12.1 Introduction

We deal now with the analysis of data which are collected
from a "designed experiment." In this category would be included
the latin square; randomized block; cross-classifications;
factorial designs; and so forth. The analysis is only slightly more
complicated in the multivariate case than in the univariate case.
This slight complication arises from the fact that the percentage
points of the F-distribution provide the critical regions for the
test in the univariate situation, whereas in the multivariate
situation no such handy table of percentage points exists. The
level of significance of any given value of the test criterion
in the multivariate case is quite quickly established, however,
the formula being given in a later section.

It is assumed that the reader has a familiarity with univariate
procedure and so only one situation will be developed;
this should suffice to demonstrate that the multivariate
procedure is a very simple modification of the univariate procedure.
We choose to investigate the two-way cross-classification, of
which the randomized block is a well known example.

12.2 The two-way cross-classification; algebraic development.

In our general design, let us assume r rows (blocks for
example) and c columns (treatments perhaps). Each vector of
correlated observations x is classifiable according to a row and
a column. Supposing replicated vector observations (n per cell)
then we shall write $x_{ijk}$ for the k-th replicate in the i-th
row, j-th column (i = 1, 2, ..., r; j = 1, 2, ..., c;
k = 1, 2, ..., n). The observations within a given cell will
have the same mean vector which will, however, vary from cell to
cell. Accordingly, our most general supposition would be:

            $E(x_{ijk}) = \xi_{ij}$        (i = 1, 2, ..., r; j = 1, 2, ..., c).

Our first test (possible only if n>1), sometimes called the "test
of additivity", alternatively called "test for zero interactions",
can be stated algebraically as

            $H_0: \xi_{ij} = \mu + \alpha_i + \beta_j$        (i = 1, 2, ..., r; j = 1, 2, ..., c)

where $\mu$, $\{\alpha_i\}_{1}^{r}$ and $\{\beta_j\}_{1}^{c}$ are
unknown. The hypothesis $H_0$ expresses the belief that the
difference in expected "yield" (x) for any two cells in the same
row is a function only of the column numbers of those cells and
not of the common row number; that is,
$\xi_{ij} - \xi_{ig}$ is independent of i.

The multivariate analysis table is:

Source            Matrix                                                                                                                                  Degrees of Freedom

Between rows      $nc \sum_{i} (\bar{x}_{i..} - \bar{x}_{...})(\bar{x}_{i..} - \bar{x}_{...})'$                                                           $r-1$

Between columns   $nr \sum_{j} (\bar{x}_{.j.} - \bar{x}_{...})(\bar{x}_{.j.} - \bar{x}_{...})'$                                                           $c-1$

Interaction       $n \sum_{i}\sum_{j} (\bar{x}_{ij.} - \bar{x}_{i..} - \bar{x}_{.j.} + \bar{x}_{...})(\bar{x}_{ij.} - \bar{x}_{i..} - \bar{x}_{.j.} + \bar{x}_{...})'$   $(r-1)(c-1)$

Error             $\sum_{i}\sum_{j}\sum_{k} (x_{ijk} - \bar{x}_{ij.})(x_{ijk} - \bar{x}_{ij.})'$                                                          $rc(n-1)$

Total             $\sum_{i}\sum_{j}\sum_{k} (x_{ijk} - \bar{x}_{...})(x_{ijk} - \bar{x}_{...})'$                                                          $nrc-1$

The similarity of this table to the univariate table is
strikingly obvious.
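As a rough illustration of how the five matrices of the table can be accumulated, the following Python sketch works on a small balanced layout. The function names, the dictionary keys R, C, I, E, T and the toy data are assumptions made for this example only; they are not part of the original text.

```python
def vec_mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(v[a] for v in vectors) / count for a in range(len(vectors[0]))]


def add_outer(M, d, weight=1.0):
    """Add weight * d d' to the square matrix M in place."""
    for a in range(len(d)):
        for b in range(len(d)):
            M[a][b] += weight * d[a] * d[b]


def two_way_manova(x):
    """x[i][j][k] is the k-th replicate vector in row i, column j (balanced)."""
    r, c, n, m = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    cell = [[vec_mean(x[i][j]) for j in range(c)] for i in range(r)]
    row = [vec_mean(cell[i]) for i in range(r)]
    col = [vec_mean([cell[i][j] for i in range(r)]) for j in range(c)]
    grand = vec_mean(row)
    # Keys: R = between rows, C = between columns, I = interaction,
    # E = error, T = total.
    M = {key: [[0.0] * m for _ in range(m)] for key in "RCIET"}
    for i in range(r):
        add_outer(M["R"], [row[i][a] - grand[a] for a in range(m)], n * c)
    for j in range(c):
        add_outer(M["C"], [col[j][a] - grand[a] for a in range(m)], n * r)
    for i in range(r):
        for j in range(c):
            inter = [cell[i][j][a] - row[i][a] - col[j][a] + grand[a]
                     for a in range(m)]
            add_outer(M["I"], inter, n)
            for obs in x[i][j]:
                add_outer(M["E"], [obs[a] - cell[i][j][a] for a in range(m)])
                add_outer(M["T"], [obs[a] - grand[a] for a in range(m)])
    return M


# Hypothetical toy data: r = 2 rows, c = 2 columns, n = 2 replicates, m = 2.
toy = [
    [[[1, 2], [3, 1]], [[2, 5], [4, 4]]],
    [[[0, 1], [2, 2]], [[5, 3], [6, 7]]],
]
M = two_way_manova(toy)
```

For a balanced layout the first four matrices add, element by element, to the total matrix, which gives a convenient arithmetic check.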

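Designating the matrices as they occur in the table by
$M_R$, $M_C$, $M_I$, $M_E$ and $M_T$ respectively, then

$$M_R + M_C + M_I + M_E = M_T .$$

To test for additivity, construct

$$L = \frac{|M_E|}{|M_E + M_I|} .$$

The use of the Doolittle procedure is valuable in the computation
of $|M_E|$ and $|M_E + M_I|$. Small values of $L$ are significant of
a contradiction to a hypothesis of additivity. In fact, if $\tau$ is
an experimental value of $-\rho_o \log L$, then it can be shown that, with

$$\rho_o = rc(n-1) - \tfrac{1}{2}(m + r + c - rc)$$

$$\pi_2 = \frac{m(r-1)(c-1)}{48\rho_o^2}\left[m^2 + (r-1)^2(c-1)^2 - 5\right]$$

$$\pi_4 = \frac{m(r-1)(c-1)}{1920\rho_o^4}\left[3m^4 + 3(r-1)^4(c-1)^4 + 10m^2(r-1)^2(c-1)^2 - 50m^2 - 50(r-1)^2(c-1)^2 + 159\right]$$

then

$$\Pr(-\rho_o \log L \ge \tau) = (1 - \pi_2 - \pi_4 + \tfrac{1}{2}\pi_2^2)\,\Pr(\chi^2_{m(r-1)(c-1)} \ge \tau)$$
$$\qquad + (\pi_2 - \pi_2^2)\,\Pr(\chi^2_{m(r-1)(c-1)+4} \ge \tau)$$
$$\qquad + (\pi_4 + \tfrac{1}{2}\pi_2^2)\,\Pr(\chi^2_{m(r-1)(c-1)+8} \ge \tau)$$
$$\qquad + \text{a term of order } \rho_o^{-6}.$$

It is noted that $\pi_2$ is usually quite small, and it is
frequently only necessary to use

$$\Pr(-\rho_o \log L \ge \tau) \approx \Pr(\chi^2_{m(r-1)(c-1)} \ge \tau).$$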
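The expansion above is easy to evaluate numerically. The sketch below is one possible implementation, written in terms of the hypothesis and error degrees of freedom (nu1 = (r-1)(c-1) and nu2 = rc(n-1) for the test of additivity). The function names are assumptions for this example, and the chi-square tail probability is computed from the standard series for the incomplete gamma function rather than read from tables.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability Pr(chi-square on df degrees of freedom >= x)."""
    if x <= 0:
        return 1.0
    a, y = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, y).
    term = total = 1.0 / a
    n = 0
    while abs(term) > 1e-16 * total and n < 10000:
        n += 1
        term *= y / (a + n)
        total += term
    lower = total * math.exp(-y + a * math.log(y) - math.lgamma(a))
    return 1.0 - lower


def wilks_significance(L, m, nu1, nu2):
    """Approximate Pr(criterion <= L) under the null hypothesis.

    m   : number of elements in each vector observation
    nu1 : degrees of freedom of the hypothesis (here interaction) matrix
    nu2 : degrees of freedom of the error matrix
    """
    rho = nu2 - 0.5 * (m + 1 - nu1)
    f = m * nu1
    pi2 = f * (m**2 + nu1**2 - 5) / (48 * rho**2)
    pi4 = f * (3 * m**4 + 3 * nu1**4 + 10 * m**2 * nu1**2
               - 50 * m**2 - 50 * nu1**2 + 159) / (1920 * rho**4)
    tau = -rho * math.log(L)
    w0 = 1 - pi2 - pi4 + 0.5 * pi2**2
    w4 = pi2 - pi2**2
    w8 = pi4 + 0.5 * pi2**2
    return (w0 * chi2_sf(tau, f) + w4 * chi2_sf(tau, f + 4)
            + w8 * chi2_sf(tau, f + 8))


# Values arising in the example of section 12.3: m = 4, r = c = 3, n = 5.
p = wilks_significance(0.68724, m=4, nu1=(3 - 1) * (3 - 1), nu2=3 * 3 * (5 - 1))
print(round(p, 2))
```

With the figures of section 12.3 this gives a significance level of about 0.65.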

12.3 An example in the test of additivity.

The following example relates to the performance of a group
of retarded children. Any child can be categorized according to
his I.Q. (broad group categories are used in our example) and
according to the type of school attended. Three broad I.Q.
categories are considered:

    Q1: I.Q. of 60 or less
    Q2: I.Q. of 75 or less, but more than 60
    Q3: I.Q. in excess of 75.

The types of school attended were three:

    S1: public school (i.e., attended by the entire
        spectrum of I.Q.'s)
    S2: special schools (i.e., attended by the somewhat
        slower child; however, individual attention is
        not the teaching method)
    S3: special schools (i.e., attended by the slower child,
        the emphasis on individual attention).

Five children were examined in each of the nine groups and judged
on four tests: arithmetic, vocabulary, general science, constructive
aptitude. The vectors listed are the scores on the tests,
top-to-bottom in the order given above. [The scores themselves
have been adjusted so that, in each test, the average score in
the (Q2,S2)-group is approximately 75; this helps in getting a
feel for the data. For example, we observe that the second line
in the (Q3,S1)-group is in the 80's, higher than "average", and
the fourth line of the (Q1,S3)-group is also rather higher than
the rest.]



TABLE OF OBSERVATIONS*

                              TYPE OF SCHOOL ATTENDED

I.Q.
CLASSIFICATION        S1                    S2                    S3

     Q1         71 70 69 69 70        72 76 72 74 76        74 81 80 80 81
                74 78 74 70 77        72 74 71 71 72        70 74 79 75 72
                78 79 71 73 76        70 70 78 76 76        78 78 79 72 75
                67 74 72 70 70        81 78 77 72 81        86 82 89 88 88

     Q2         65 65 70 69 72        70 72 76 71 78        77 79 76 78 83
                76 76 79 77 79        74 73 73 74 77        73 74 70 70 74
                78 77 79 76 76        77 72 78 76 75        73 74 79 77 70
                72 68 66 71 71        70 71 73 77 73        82 81 85 82 84

     Q3         72 74 69 74 72        75 72 73 78 77        79 77 86 78 82
                83 82 82 84 78        73 78 78 82 81        79 73 79 77 77
                74 79 76 75 75        78 73 71 74 71        75 76 78 73 79
                67 69 72 67 73        76 81 76 78 74        76 83 82 82 75

*Within a given cell there are five replicates of (4x1) column
vectors; each column of a cell is one replicate.

Table of Sums





                  S1       S2       S3      Sub Total

     Q1          349      370      396        1115
                 373      360      370        1103
                 377      370      382        1129
                 353      389      433        1175

     Q2          341      367      393        1101
                 387      371      361        1119
                 386      378      373        1137
                 348      364      414        1126

     Q3          361      375      402        1138
                 409      392      385        1186
                 379      367      381        1127
                 348      385      398        1131

     Sub        1051     1112     1191        3354
     Totals     1169     1123     1116        3408
                1142     1115     1136        3393
                1049     1138     1245        3432

Before computing tables of cross-products, we subtract 60 from
every element (for computational convenience).

Table of Within Cell Cross Products








                 S1                           S2                           S3

Q1     483   620   764   515       996   846   976  1248      1878  1366  1565  2554
       620  1105  1150   788       846   726   826  1075      1366  1026  1153  1875
       764  1150  1231   814       976   826  1036  1224      1565  1153  1378  2171
       515   788   814   589      1248  1075  1224  1639      2554  1875  2171  3569

Q2     375   731   706   391       945   964  1048   861      1759  1149  1327  2121
       731  1523  1497   830       964  1019  1107   911      1149   761   865  1386
       706  1497  1486   818      1048  1107  1238  1003      1327   865  1115  1669
       391   830   818   486       861   911  1003   848      2121  1386  1669  2610

Q3     761  1334   968   572      1151  1398  1008  1262      2134  1756  1677  1993
      1334  2397  1722  1025      1398  1742  1205  1564      1756  1469  1379  1650
       968  1722  1263   761      1008  1205   931  1143      1677  1379  1335  1575
       572  1025   761   492      1262  1564  1143  1473      1993  1650  1575  1978



Matrix of total variation ($M_T$) [degrees of freedom: 44]

$$M_T = \frac{1}{45}\begin{pmatrix} 43974 & -5652 & -1467 & 39537 \\ -5652 & 28296 & 36 & -18576 \\ -1467 & 36 & 15336 & -4266 \\ 39537 & -18576 & -4266 & 79956 \end{pmatrix}$$

Matrix of error variation ($M_E$) [degrees of freedom: 36]

$$M_E = \frac{1}{5}\begin{pmatrix} 1324 & 107 & 3 & -167 \\ 107 & 1110 & -174 & -55 \\ 3 & -174 & 1412 & -194 \\ -167 & -55 & -194 & 1502 \end{pmatrix}$$

Matrix of column (schooling) variation ($M_C$) [degrees of freedom: 2]

$$M_C = \frac{1}{45}\begin{pmatrix} 29562 & -10779 & -828 & 41322 \\ -10779 & 4974 & 1413 & -15231 \\ -828 & 1413 & 1206 & 1332 \\ 41322 & -15231 & 1332 & 57786 \end{pmatrix}$$

Matrix of row (I.Q.) variation ($M_R$) [degrees of freedom: 2]

$$M_R = \frac{1}{45}\begin{pmatrix} 2094 & 4164 & -528 & -141 \\ 4164 & 11634 & -708 & -4101 \\ -528 & -708 & 168 & -354 \\ -141 & -4101 & -354 & 4362 \end{pmatrix}$$

Matrix of interaction variation ($M_I$) [degrees of freedom: 4]
[obtained by subtraction]

$$M_I = \frac{1}{45}\begin{pmatrix} 402 & 0 & -138 & -3147 \\ 0 & 1698 & 897 & 811 \\ -138 & 897 & 1254 & -3498 \\ -3147 & 811 & -3498 & 4290 \end{pmatrix}$$



Before proceeding to the construction of the test statistic it is
interesting to compare the matrices $\frac{1}{36}M_E$ and $\frac{1}{4}M_I$ in respect of
their diagonal terms; this would correspond to four individual
tests of

    additivity of effects on arithmetic score
    additivity of effects on vocabulary score
    additivity of effects on general science score
    additivity of effects on constructive aptitude score.

The F ratios are respectively:

$$\frac{402}{1324}, \quad \frac{1698}{1110}, \quad \frac{1254}{1412}, \quad \frac{4290}{1502}$$

(approximately 0.30, 1.53, 0.89 and 2.86). Only the last of these
is significant at the 5% level when considered as an individual test.

The next step is to perform a Doolittle on $M_E$ and on $M_E + M_I$ to
obtain the determinants of each.



Doolittle on $5M_E$

                                                            Check

    1324        107           3          -167              1267
    1324        107           3          -167              1267
    1           0.080815      0.002265   -0.126132         0.956948

                1110         -174        -55                988
                1101.3528    -174.2424   -41.5039           885.6066
                1            -0.158207   -0.037684          0.804108

                              1412       -194               1047
                              1384.4268  -200.1878          1184.2389
                              1          -0.144599          0.855400

                                          1502
                                          1450.4250

$$|M_E| = \frac{(1324)(1101.3528)(1384.4268)(1450.4250)}{5^4} = 0.468489 \times 10^{10}$$

Doolittle on $45(M_E + M_I)$






                                                            Check

    12318       963          -111        -4650              8520
    12318       963          -111        -4650              8520
    1           0.078178     -0.009011   -0.377496          0.691670

                11688        -669         316               12298
                11612.715    -660.322     679.529           11631.922
                1            -0.056861    0.058515          1.001653

                              13962      -5244              7938
                              13923.453  -5247.263          8676.189
                              1          -0.376865          0.623134

                                          17808
                                          14035.371

$$|M_E + M_I| = \frac{(12318)(11612.715)(13923.453)(14035.371)}{(45)^4} = 0.68170 \times 10^{10}$$

Finally

$$L = \frac{|M_E|}{|M_E + M_I|} = 0.68724$$
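The two determinants can be checked with a few lines of code. The sketch below performs plain forward elimination without row interchanges and takes the determinant as the product of the pivots, which are the leading entries of the second line of each Doolittle block. The function and variable names are assumptions for this example; the two integer matrices are $5M_E$ and $45(M_E + M_I)$ as printed in the text.

```python
def doolittle_pivots(matrix):
    """Pivots from forward elimination (no row interchanges) of a square matrix."""
    work = [[float(v) for v in row] for row in matrix]
    size = len(work)
    pivots = []
    for k in range(size):
        pivots.append(work[k][k])
        for i in range(k + 1, size):
            factor = work[i][k] / work[k][k]
            for j in range(k, size):
                work[i][j] -= factor * work[k][j]
    return pivots


def determinant(matrix):
    """Determinant as the product of the elimination pivots."""
    product = 1.0
    for pivot in doolittle_pivots(matrix):
        product *= pivot
    return product


# 5 * M_E and 45 * (M_E + M_I), copied from the worked example.
five_ME = [
    [1324, 107, 3, -167],
    [107, 1110, -174, -55],
    [3, -174, 1412, -194],
    [-167, -55, -194, 1502],
]
forty_five_sum = [
    [12318, 963, -111, -4650],
    [963, 11688, -669, 316],
    [-111, -669, 13962, -5244],
    [-4650, 316, -5244, 17808],
]

det_ME = determinant(five_ME) / 5**4
det_sum = determinant(forty_five_sum) / 45**4
L = det_ME / det_sum
print(doolittle_pivots(five_ME))
print(det_ME, det_sum, L)
```

Elimination without pivoting is adequate here because both matrices are symmetric with dominant positive diagonals; a general-purpose routine would normally interchange rows.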






To establish the significance level of this experimental value
we require $\Pr\{L \le 0.68724\}$. Now m=4; r=c=3; n=5 so that

$$\rho_o = 36 - \tfrac{1}{2}(4 + 3 + 3 - 9) = 35.5$$

$$\pi_2 = \frac{4 \cdot 2 \cdot 2}{48\rho_o^2}\,(4^2 + 2^2 \cdot 2^2 - 5) = 0.007141$$

$$\pi_4 = \frac{4 \cdot 2 \cdot 2}{1920\rho_o^4}\,(3 \cdot 4^4 + 3 \cdot 2^4 \cdot 2^4 + 10 \cdot 4^2 \cdot 2^2 \cdot 2^2 - 50 \cdot 4^2 - 50 \cdot 2^2 \cdot 2^2 + 159) = 0.13931 \times 10^{-4}$$

Clearly $\pi_2^2$ and $\pi_4$ can be neglected in the formula for the
significance level. We have then

$$\Pr\{L \le 0.68724\} = \Pr\{-\log L \ge 0.3751\}$$
$$= \Pr\{-35.5 \log L \ge 13.32\}$$
$$= (0.992859)\Pr\{\chi^2_{16} \ge 13.32\} + (0.007141)\Pr\{\chi^2_{20} \ge 13.32\}$$
$$= 0.651$$

A value of $L$ less than or equal to the observed value of 0.68724
could be obtained on approximately 65% of occasions by chance
variation; there is no evidence to contradict the hypothesis that the
effects are additive.

Since the so-called interaction term is not significantly
different from zero, it would be meaningful to test the main effects.
Suppose it is required to test that only the second (vocabulary)
and fourth (constructive aptitude) scores differ with I.Q. level.
A check on diagonal elements of row matrix versus error matrix
suggests that the least contribution is due to the third score,
the next to the first score, the next to the fourth score, and the
most contribution from the second score. We reorder our elements
within the vector to conform with this contribution order. The
listing of scores now becomes

    general science
    arithmetic
    constructive aptitude
    vocabulary.

Rearranging $M_E$ and $M_R$ to conform with this ordering we have

$$M_E = \frac{1}{5}\begin{pmatrix} 1412 & 3 & -194 & -174 \\ 3 & 1324 & -167 & 107 \\ -194 & -167 & 1502 & -55 \\ -174 & 107 & -55 & 1110 \end{pmatrix}$$

$$M_R = \frac{1}{45}\begin{pmatrix} 168 & -528 & -354 & -708 \\ -528 & 2094 & -141 & 4164 \\ -354 & -141 & 4362 & -4101 \\ -708 & 4164 & -4101 & 11634 \end{pmatrix}$$



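We perform forward Doolittles on each of $M_E$ and $M_E + M_R$, where

$$M_E + M_R = \frac{1}{45}\begin{pmatrix} 12876 & -501 & -2100 & -2274 \\ -501 & 14010 & -1644 & 5127 \\ -2100 & -1644 & 17880 & -4596 \\ -2274 & 5127 & -4596 & 21624 \end{pmatrix}$$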




Doolittle on $5M_E$

                                                            Check

    1412        3            -194        -174               1047
    1412        3            -194        -174               1047
    1           0.002124     -0.137393   -0.123224          0.741501

                1324         -167         107               1267
                1323.9936    -166.5878    107.3697          1264.7755
                1            -0.125822    0.081095          0.955273

                              1502       -55                1086
                              1454.3853  -65.3970           1388.9880
                              1          -0.044965          0.955034

                                          1110
                                          1076.9104

$$|M_E| = \frac{1}{5^4}(1412)(1323.9936)(1454.3853)(1076.9104) = 0.468489 \times 10^{10}$$

Doolittle on $45(M_E + M_R)$











                                                            Check

    12876       -501         -2100       -2274              8001
    12876       -501         -2100       -2274              8001
    1           -0.038909    -0.163094   -0.176607          0.621385

                14010        -1644        5127              16992
                13990.5066   -1725.7101   5038.5199         17303.3154
                1            -0.123348    0.360138          1.236789

                              17880      -4596              9540
                              17324.6397 -4345.3809         12979.2541
                              1          -0.250820          0.749179

                                          21624
                                          16693.9248








          $d_{1j}$:                $d_{2j}$:
  j       diagonal terms of        diagonal terms of         F-ratio*      d.f.
          Doolittle on $M_E$       Doolittle on $M_E + M_R$

  1       282.400                  286.133                    0.2379       2:36
  2       264.799                  310.900                    3.0467       2:35
  3       290.877                  384.992                    5.5004       2:34
  4       215.382                  370.976                   11.9197       2:33

$$\text{*F-ratio} = \frac{(d_{2j} - d_{1j})\,/\,\text{degrees of freedom of } M_R}{d_{1j}\,/\,(1 - j + \text{degrees of freedom of } M_E)}$$

Of these ratios, the first and second (j = 1,2) are not significant
at the 5% level. The third, considered as a single test, would be
significant at the 1% level (but not at the 0.5% level); the fourth
figure is highly significant.

It seems reasonable to infer that, within the range considered,
the performance in general science and arithmetic is relatively
unaffected by the level of I.Q.; however, the individual will
perform significantly differently in constructive aptitude and
vocabulary.

The j-th F-ratio measures the contribution to significance
over and above the contribution of the 1st, 2nd, ..., (j-1)-th
variable. The F-ratio of 0.2379 measures the contribution to
significance of the difference between the first element in the
vector

    general science score
    arithmetic score
    constructive aptitude score
    vocabulary score

between the I.Q. levels. The figure is so small that we elect
to state that no differences exist; an examination of the table
of sums (third element in sub-totals for rows: 1129; 1137;
1127) fortifies our belief in the non-existence of any difference.
The F-ratio 3.0467 is not significant at the 5% level for single
tests; this ratio corresponds to a comparison of the arithmetic
scores for which the subtotals are 1115; 1101; 1138. The F-ratio
5.5004 (constructive aptitude scores) corresponds to the subtotal
figures 1175; 1126; 1131. This last F-ratio is rather border-line
and one may hesitate to assert that the difference is significant
(bearing in mind that several tests have been made and that, since
the data have been ordered, the "F-ratio" does not have exactly the
F-distribution). The final F-ratio of 11.9197 corresponds to the
vocabulary scores for which the row (I.Q.) subtotals are 1103;
1119; 1186. We assert with some confidence that the vocabulary
score differs significantly between the I.Q. groups within the
range considered.

CHAPTER 13

ANALYSIS OF REGRESSION



13.1 Introduction

In the analysis of regression we lack the simplicity of
computation usually found in the analysis of design; to compensate for
this, however, is the fact that the so-called "design matrix" is of
full rank, the result of which is that the square matrices of our
normal equations do indeed have inverses. We look in this chapter
at two problems (which actually are essentially the same):

(a) the curvilinear regression on a single concomitant
variable

(b) the multiple regression on several concomitant variables.



13.2 Curvilinear regression (general case).

Observed are $n_i$ (m×1) vector variates at time (or temperature
or any other concomitant variable) $t_i$. We denote the observations
by $\{y_{ij}\}$, $i = 1, \ldots, k$; $j = 1, \ldots, n_i$, so that k different
temperatures are involved.

Let $u_j(t)$ be a polynomial in t of degree j (j = 0, 1, 2, ...); these
polynomials are arbitrary; except in special cases we will usually
adopt the system $u_j(t) = t^j$. Our objective is to test the
hypothesis that each element in the vector $E(y_{ij})$ is a polynomial in
t of degree no more than s.

We may write our model:

$$E(y_{ij}) = \beta_0 u_0(t_i) + \beta_1 u_1(t_i) + \cdots + \beta_s u_s(t_i) + d_s(t_i) \qquad (13.2.1)$$

where $\sum_{\gamma=0}^{s} \beta_\gamma u_\gamma(t_i)$ represents a general polynomial of degree
s and where $d_s(t_i)$ represents a "departure from s-th degree
polynomial." The analysis table again has the striking resemblance
to the univariate table; squares in the latter become the
product of the column vector by its transpose in the former.

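The table is:

Source                           Matrix                                                                              Degrees of freedom

Due to polynomial                $\sum_{i=1}^{k} n_i \hat{Y}_i \hat{Y}_i'$                                           s+1
of degree s

About polynomial                 $\sum_{i=1}^{k} n_i (\bar{y}_{i.} - \hat{Y}_i)(\bar{y}_{i.} - \hat{Y}_i)'$          k-s-1
of degree s

Error                            $\sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i.})(y_{ij} - \bar{y}_{i.})'$    N-k

Total                            $\sum_{i=1}^{k}\sum_{j=1}^{n_i} y_{ij} y_{ij}'$                                     N

In the table $\hat{Y}_i$ is the best estimate of $E(y_{ij})$ on the assumption
that $H_o$ is true and is obtained via

$$\hat{Y}_i = \hat{\beta}_0 u_0(t_i) + \hat{\beta}_1 u_1(t_i) + \cdots + \hat{\beta}_s u_s(t_i)$$

where $\{\hat{\beta}_\gamma\}$ are the solutions of

$$\sum_{i=1}^{k} n_i \bar{y}_{i.}\, u_\gamma(t_i) = \sum_{\delta=0}^{s} \hat{\beta}_\delta \Big[\sum_{i=1}^{k} u_\delta(t_i) u_\gamma(t_i) n_i\Big], \qquad \gamma = 0, 1, \ldots, s,$$

$$\bar{y}_{i.} = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij}$$

and $N = \sum_{i=1}^{k} n_i$.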



Clearly the stickiest part of the analysis is the solution for
the $\{\hat{\beta}_\gamma\}_{\gamma=0}^{s}$; however this is accomplished routinely through a
Doolittle computation.

Let G be the matrix (m × s+1) whose $\gamma$-th column is

$$\sum_{i=1}^{k} n_i \bar{y}_{i.}\, u_\gamma(t_i)$$

and A be the matrix (symmetric, s+1 × s+1) whose $(\delta,\gamma)$-th element is

$$\sum_{i=1}^{k} n_i u_\delta(t_i) u_\gamma(t_i), \qquad \delta, \gamma = 0, 1, 2, \ldots, s.$$

If $\hat{B}$ is the (m × s+1) matrix $\hat{B} = (\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_s)$ then our system of
equations on the $\hat{\beta}_\gamma$ can be written as

$$G = \hat{B}A$$

with G and A known. The problem of finding $\hat{B}$ is dealt with in
Chapter 10. Having got $\hat{B}$, the $\hat{Y}_i$ are easy to calculate and thence
the entries in the analysis table.
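To make the system $G = \hat{B}A$ concrete, the sketch below builds G and A for the choice $u_\gamma(t) = t^\gamma$ and solves for $\hat{B}$ one row at a time by ordinary Gaussian elimination. This is only an illustration: the function names and the toy data are assumptions for this example, and Gaussian elimination stands in for the Doolittle layout of Chapter 10.

```python
def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    size = len(A)
    aug = [[float(v) for v in A[i]] + [float(rhs[i])] for i in range(size)]
    for k in range(size):
        p = max(range(k, size), key=lambda i: abs(aug[i][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, size):
            factor = aug[i][k] / aug[k][k]
            for j in range(k, size + 1):
                aug[i][j] -= factor * aug[k][j]
    x = [0.0] * size
    for i in reversed(range(size)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (aug[i][size] - tail) / aug[i][i]
    return x


def fit_polynomial(times, counts, means, s):
    """Return B-hat (m rows, s+1 columns) solving G = B A with u_g(t) = t**g.

    times[i]  : t_i
    counts[i] : n_i, the number of replicates at t_i
    means[i]  : the mean vector (length m) of the replicates at t_i
    """
    m = len(means[0])
    u = [[t ** g for g in range(s + 1)] for t in times]
    A = [[sum(n * ui[d] * ui[g] for n, ui in zip(counts, u))
          for g in range(s + 1)] for d in range(s + 1)]
    G = [[sum(n * yb[a] * ui[g] for n, yb, ui in zip(counts, means, u))
          for g in range(s + 1)] for a in range(m)]
    # A is symmetric, so row a of B satisfies A b = (row a of G).
    return [solve(A, G[a]) for a in range(m)]


# Hypothetical data: first element follows 1 + 2t - t^2, second 3 - t/2.
times = [0, 1, 2, 3]
counts = [2, 1, 3, 2]
means = [[1.0, 3.0], [2.0, 2.5], [1.0, 2.0], [-2.0, 1.5]]
B = fit_polynomial(times, counts, means, s=2)
print(B)
```

Because the toy means lie exactly on polynomials of degree two or less, the fit recovers their coefficients row by row.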

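Labelling the four matrices in the table, top to bottom, by
$C_D$, $C_A$, $C_E$, $C_T$ respectively, then the test function is

$$L = \frac{|C_E|}{|C_E + C_A|};$$

small values of $L$ are significant.

To assess the significance of an experimental value of $L$
we set

$$\nu_1 = k-s-1$$
$$\nu_2 = N-k$$
$$\rho_o = \nu_2 - \tfrac{1}{2}(m + 1 - \nu_1)$$
$$\pi_2 = \frac{m\nu_1}{48\rho_o^2}\,(m^2 + \nu_1^2 - 5)$$
$$\pi_4 = \frac{m\nu_1}{1920\rho_o^4}\,(3m^4 + 3\nu_1^4 + 10m^2\nu_1^2 - 50m^2 - 50\nu_1^2 + 159)$$

and note that

$$\Pr(-\rho_o \log L \ge \tau) = (1 - \pi_2 - \pi_4 + \tfrac{1}{2}\pi_2^2)\Pr(\chi^2_{m\nu_1} \ge \tau) + (\pi_2 - \pi_2^2)\Pr(\chi^2_{m\nu_1+4} \ge \tau)$$
$$\qquad + (\pi_4 + \tfrac{1}{2}\pi_2^2)\Pr(\chi^2_{m\nu_1+8} \ge \tau) + \text{a term of order } \rho_o^{-6}.$$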




13.3 Curvilinear regression; no replication.

In the case of no replication ($n_i = 1$), the "error matrix" of
the previous section is identically zero; we need some kind of
estimate of the common dispersion matrix of the $y_i$. To
consolidate ideas, let us set up our model again for this case of
no replication; the suffix "j" is now dropped since j can only
take the value 1. The model is

$$E(y_i) = \beta_0 u_0(t_i) + \beta_1 u_1(t_i) + \cdots + \beta_s u_s(t_i) + d_s(t_i).$$

If it should turn out that s is too small to represent the degree
of the polynomial which expresses the behaviour of $E(y_i)$, then
the "about polynomial of degree s" of the previous section would
have proved significant; on the other hand, had the true degree
of the polynomial been s or less then the "about polynomial of
degree s" would itself have been a measure of the dispersion matrix
of a vector observation. We use this last statement to deal with
this unreplicated data. It is required to test the hypothesis
($H_o$) that the elements of $E(y_i)$ are polynomials of order
not exceeding s. In preparing an analysis table it is to be
remembered that $H_o$ may be false, in which case the degree of the
polynomial will indeed be greater than s; in this event the
"about polynomial of degree s" will not be a measure of the
common dispersion matrix but will be confounded with variations
of the true polynomial about the fitted s-th order polynomial.

We find ourselves forced to make some pronouncement
concerning the true order of the polynomial and state (for some
$\ell > 0$) that we believe that the order of the regression polynomial
is not in excess of $s+\ell$. For example, we may wish to test that
the regression is quadratic (s=2), feeling that ($H_1$) if a
quadratic is inadequate then a cubic must surely be a polynomial of
sufficiently high order to represent the data; we might feel
safe in taking $\ell = 1$ or, to be "on the safe side", it might be
preferred that $\ell$ be set equal to 3. As in the previous section,
let G be the (m × s+1) matrix whose $\gamma$-th column is

$$\sum_{i=1}^{k} y_i\, u_\gamma(t_i), \qquad \gamma = 0, 1, \ldots, s,$$

and A be the matrix (symmetric, s+1 × s+1) whose $(\delta,\gamma)$-th element is

$$\sum_{i=1}^{k} u_\delta(t_i) u_\gamma(t_i);$$

then if $\hat{B}$ is the (m × s+1) matrix $\hat{B} = (\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_s)$, then $\hat{B}$ is
the solution of (see Chapter 10)

$$G = \hat{B}A .$$

If now

$$\hat{Y}_i = \hat{\beta}_0 u_0(t_i) + \hat{\beta}_1 u_1(t_i) + \cdots + \hat{\beta}_s u_s(t_i)$$

then $\sum_{i=1}^{k} \hat{Y}_i \hat{Y}_i'$ is the "due to polynomial of degree s" and
$\sum_{i=1}^{k} (y_i - \hat{Y}_i)(y_i - \hat{Y}_i)'$ is a measure of the variation not accounted for
by the s-th order polynomial; this will be a measure of the
dispersion matrix if and only if $H_o$ is true.

Let now $G^*$ be the (m × s+$\ell$+1) matrix whose $\gamma$-th column is

$$\sum_{i=1}^{k} y_i\, u_\gamma(t_i), \qquad \gamma = 0, 1, 2, \ldots, s+\ell,$$

so that G is the first s+1 columns of $G^*$, and let $A^*$ be the matrix
(symmetric, s+$\ell$+1 × s+$\ell$+1) whose $(\delta,\gamma)$-th element is

$$\sum_{i=1}^{k} u_\delta(t_i) u_\gamma(t_i), \qquad \delta, \gamma = 0, 1, 2, \ldots, s+\ell;$$

then if $\hat{B}^*$ is the (m × s+$\ell$+1) matrix $\hat{B}^* = (\hat{\beta}^*_0, \hat{\beta}^*_1, \ldots, \hat{\beta}^*_{s+\ell})$,
then $\hat{B}^*$ is the solution (see Chapter 10) of

$$G^* = \hat{B}^* A^* .$$

Now set

$$\hat{Y}^*_i = \sum_{\gamma=0}^{s+\ell} \hat{\beta}^*_\gamma u_\gamma(t_i);$$

then $\sum_{i=1}^{k} (y_i - \hat{Y}^*_i)(y_i - \hat{Y}^*_i)'$ is a measure of the variation of the $y_i$
about a polynomial of degree $s+\ell$. If our choice of $\ell$ is not too
small, this should be purely random fluctuation.

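Our analysis table is:

Source                           Matrix                                                                                   Degrees of Freedom

Due to polynomial                $\sum_{i=1}^{k} \hat{Y}_i \hat{Y}_i'$                                                    $s+1$
of degree s

Gain in fitting polynomial       $\sum_{i=1}^{k} \hat{Y}^*_i \hat{Y}^{*\prime}_i - \sum_{i=1}^{k} \hat{Y}_i \hat{Y}_i'$   $\ell$
of degree s+$\ell$

About polynomial of              $\sum_{i=1}^{k} (y_i - \hat{Y}^*_i)(y_i - \hat{Y}^*_i)'$                                 $k-\ell-s-1$
degree s+$\ell$

Total                            $\sum_{i=1}^{k} y_i y_i'$                                                                $k$

If the matrices listed above are, top to bottom, $C_D$, $C_G$, $C_A$ and
$C_T$, then $C_D + C_G + C_A = C_T$. Our test function is

$$L = \frac{|C_A|}{|C_G + C_A|}$$

and to assess the significance of $L$, we set

$$\nu_1 = \ell$$
$$\nu_2 = k-\ell-s-1$$
$$\rho_o = \nu_2 - \tfrac{1}{2}(m + 1 - \nu_1)$$
$$\pi_2 = \frac{m\nu_1}{48\rho_o^2}\,(m^2 + \nu_1^2 - 5)$$
$$\pi_4 = \frac{m\nu_1}{1920\rho_o^4}\,(3m^4 + 10m^2\nu_1^2 + 3\nu_1^4 - 50(m^2 + \nu_1^2) + 159)$$

then

$$\Pr(-\rho_o \log L \ge \tau) = (1 - \pi_2 - \pi_4 + \tfrac{1}{2}\pi_2^2)\Pr(\chi^2_{m\nu_1} \ge \tau) + (\pi_2 - \pi_2^2)\Pr(\chi^2_{m\nu_1+4} \ge \tau)$$
$$\qquad + (\pi_4 + \tfrac{1}{2}\pi_2^2)\Pr(\chi^2_{m\nu_1+8} \ge \tau) + \text{a term of order } \rho_o^{-6}.$$

CHAPTER 14

TEST OF HYPOTHESIS ON THE DISPERSION MATRIX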



14.1 Introduction

We have seen in the preceding chapters how tests of hypothesis
in the multivariate case are directly related to the corresponding
test in the univariate case; the only essential difference is the
distribution of the test criterion.

In the univariate case we have only a single (unknown)
parameter, $\sigma$, expressing the standard deviation of the random error;
in the multivariate case we have the array of parameters in V,
the dispersion matrix of the vector observation x. As a
consequence we have the possibility of hypotheses concerning the
elements of V which do not have a counterpart in the univariate
methods and theory. In this chapter we shall be concerned with a
test of independence between specified groups of elements of the
observed variate x.

14.2 The intra-independence of elements of x.

Let us assume we have available the sequence of observations
$x_1, x_2, \ldots, x_n$ randomly and independently drawn from a normal
population (or set of populations) with dispersion matrix (or dispersion
matrices all equal to) V. For any given model for the mean vectors,
$E(x_j)$, we can construct an error matrix, $M_E$ say. Examples of such
a construction are given in earlier chapters; in Chapter 12, $M_E$
would be the error matrix in the cross-classification. If
$E(x_j) = \mu$, j = 1, ..., n, so that $x_1, x_2, \ldots, x_n$ are drawn from the same
population (that is, same mean, same dispersion matrix) then

$$M_E = \sum_{j=1}^{n} (x_j - \bar{x}_.)(x_j - \bar{x}_.)' \qquad (\nu = n-1 \text{ below})$$

where $\bar{x}_. = \frac{1}{n} \sum_{j} x_j$.

Suppose the degrees of freedom associated with the error matrix
is $\nu$ ($\nu = rc(n-1)$ in the example in Chapter 12), then we may require
to test

$$H_o:\ v_{ij} = 0, \qquad i \ne j; \quad i, j = 1, 2, \ldots, m,$$

that is, all off-diagonal elements of V are zero. If x has the
multivariate normal density then V diagonal is a necessary and
sufficient condition for the mutual independence of all elements
of x.

The appropriate likelihood ratio test function is monotonic
in

$$L = \frac{|M_E|}{\prod_{j=1}^{m} (M_E)_{jj}}$$

where $(M_E)_{jj}$ is the j-th diagonal element of $M_E$. The null
distribution of this criterion is unobtainable except for m=2, when
$L$ reduces to essentially the well known test for independence
between two variates $x_1$ and $x_2$ (that is, $L$ is $1 - r^2$, where r is
the sample correlation coefficient).

For values of m exceeding two, we resort to an approximation
to obtain the significance level of an observed value of $L$. With

$$f = \tfrac{1}{2}m(m-1)$$
$$\rho_o = \nu - \frac{2m+5}{6}$$

then

$$\Pr[-\rho_o \log L > \tau] = \Pr[\chi^2_f > \tau] + O(\rho_o^{-2}).$$



By way of illustration we take the error matrix of the
example in Chapter 12. [It is noted there that the off-diagonal
elements in $M_E$ are small compared with the diagonal terms,
suggesting the possibility of mutual independence.]

Now $|M_E| = 0.468489 \times 10^{10}$

and $\prod_{j=1}^{4} (M_E)_{jj} = 0.498696 \times 10^{10}$

so that $L = 0.9394$; $f = 6$; $\rho_o = 203/6$.

Now small values of $L$ are significant and

$$\Pr[L \le 0.9394] = \Pr[\log L \le -0.0625]$$
$$= \Pr[-\rho_o \log L \ge 2.11] = \Pr[\chi^2_6 \ge 2.11]$$
$$= 0.91$$

There is every evidence therefore that the hypothesis of mutual
independence of the scores discussed in Chapter 12 can be held
to be true.
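The arithmetic of this illustration can be retraced with a short script. The sketch below is an assumption-laden illustration rather than the author's procedure: the function names are invented for the example, the determinant is taken as a product of elimination pivots, and the chi-square tail probability comes from a series for the incomplete gamma function. Since the criterion is unchanged when the matrix is multiplied by a constant, $5M_E$ is used directly.

```python
import math


def determinant(matrix):
    """Determinant as the product of forward-elimination pivots."""
    work = [[float(v) for v in row] for row in matrix]
    size = len(work)
    product = 1.0
    for k in range(size):
        product *= work[k][k]
        for i in range(k + 1, size):
            factor = work[i][k] / work[k][k]
            for j in range(k, size):
                work[i][j] -= factor * work[k][j]
    return product


def chi2_sf(x, df):
    """Upper-tail probability Pr(chi-square on df degrees of freedom >= x)."""
    if x <= 0:
        return 1.0
    a, y = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 0
    while abs(term) > 1e-16 * total and n < 10000:
        n += 1
        term *= y / (a + n)
        total += term
    return 1.0 - total * math.exp(-y + a * math.log(y) - math.lgamma(a))


def independence_test(M, nu):
    """Criterion |M| / prod(M_jj) and its approximate significance level."""
    m = len(M)
    diagonal_product = 1.0
    for j in range(m):
        diagonal_product *= M[j][j]
    L = determinant(M) / diagonal_product
    f = m * (m - 1) // 2
    rho = nu - (2 * m + 5) / 6.0
    tau = -rho * math.log(L)
    return L, chi2_sf(tau, f)


# 5 * M_E from Chapter 12 (36 degrees of freedom).
five_ME = [
    [1324, 107, 3, -167],
    [107, 1110, -174, -55],
    [3, -174, 1412, -194],
    [-167, -55, -194, 1502],
]
L, p = independence_test(five_ME, nu=36)
print(round(L, 4), round(p, 2))
```

The probability returned is the chance, under mutual independence, of a criterion at least as small as the one observed.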

14.3 The inter-independence of two sets of variates.

Again we assume available the sequence of observations $x_1$,
$x_2, \ldots, x_n$ randomly and independently drawn from a normal
population (or populations) such that the error matrix ($M_E$) has $\nu$
degrees of freedom. For convenience we shall drop the suffix E to
$M_E$ in this section, writing simply M for the error matrix. If x
is partitioned into two sets of elements $x_{(1)}$ and $x_{(2)}$ so that

$$x = \begin{pmatrix} x_{(1)} \\ x_{(2)} \end{pmatrix}$$

with p elements in the vector $x_{(1)}$ and q elements in the
vector $x_{(2)}$, it may be required to test that every element of
$x_{(1)}$ is independent of every element of $x_{(2)}$ (still
allowing that there may be intra-dependence in either or both partition
vectors). If the dispersion matrix is V and we partition V as:

$$V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}$$

with $V_{11}$ a p×p matrix and $V_{22}$ a q×q matrix then the hypothesis
of inter-independence can be stated as:

$$H_o:\ V_{12} = 0 \qquad (V_{12} \text{ a matrix of all zeros}).$$

If M be partitioned in the same way as was V, that is

$$M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}, \qquad M_{11} \text{ is } p \times p;\ M_{22} \text{ is } q \times q;\ p+q = m,$$

then our test criterion is

$$L = \frac{|M|}{|M_{11}|\,|M_{22}|} .$$

The distribution of $L$ is dependent upon p and q and we have a
readily available method of getting a critical region for $L$ if
p or q is either 1 or 2; otherwise we resort to a very adequate
approximation.

Case p=1. (q=m-1).

$$\frac{1-L}{L} \cdot \frac{\nu-q}{q} \text{ is distributed as } F_{q;\,\nu-q} \text{ (large values significant).}$$

Case q=1. (p=m-1).

$$\frac{1-L}{L} \cdot \frac{\nu-p}{p} \text{ is distributed as } F_{p;\,\nu-p} \text{ (large values significant).}$$

Case p=2. (q=m-2).

$$\frac{1-\sqrt{L}}{\sqrt{L}} \cdot \frac{\nu-q-1}{q} \text{ is distributed as } F_{2q;\,2(\nu-q-1)} \text{ (large values significant).}$$

Case q=2. (p=m-2).

$$\frac{1-\sqrt{L}}{\sqrt{L}} \cdot \frac{\nu-p-1}{p} \text{ is distributed as } F_{2p;\,2(\nu-p-1)} \text{ (large values significant).}$$

If both p and q exceed 2 then we set

$$\rho_o = \nu - \frac{m+1}{2}$$
$$\pi_2 = \frac{pq}{48\rho_o^2}\,(p^2 + q^2 - 5)$$
$$\pi_4 = \frac{pq}{1920\rho_o^4}\,(3(p^4 + q^4) + 10p^2q^2 - 50(p^2 + q^2) + 159)$$

then

$$\Pr[-\rho_o \log L > \tau] = (1 - \pi_2 - \pi_4 + \tfrac{1}{2}\pi_2^2)\Pr[\chi^2_{pq} > \tau]$$
$$\qquad + (\pi_2 - \pi_2^2)\Pr[\chi^2_{pq+4} > \tau] + (\pi_4 + \tfrac{1}{2}\pi_2^2)\Pr[\chi^2_{pq+8} > \tau]$$
$$\qquad + \text{terms of order } \rho_o^{-6}.$$

14.4 Equality of a number of dispersion matrices.

Suppose from each of k populations we have available an
estimate of the dispersion matrix based on an "error-matrix" $C_t$ with,
say, $\nu_t$ degrees of freedom (t = 1, 2, ..., k). The estimate of $V_t$, the
dispersion matrix for the t-th population, would be therefore $\frac{1}{\nu_t}C_t$.
It is required to test

$$H_o:\ V_1 = V_2 = \cdots = V_k .$$

The likelihood ratio test currently used is that derivable
from the joint density of the $\{C_t\}_{t=1}^{k}$ rather than from the original
(normally distributed) variates. This likelihood ratio is u where

$$\log u = \tfrac{1}{2}\sum_{t=1}^{k} \nu_t \log\left|\frac{C_t}{\nu_t}\right| - \tfrac{1}{2}\Big(\sum_{t=1}^{k} \nu_t\Big) \log\left|\frac{\sum_t C_t}{\sum_t \nu_t}\right| .$$

The distribution of u, or actually log u, can be approximated to
by a $\chi^2$-density. If we define

$$\rho_o = 1 - \frac{2m^2 + 3m - 1}{6(m+1)(k-1)} \left(\sum_{t=1}^{k} \frac{1}{\nu_t} - \frac{1}{\sum_t \nu_t}\right)$$

and define $\gamma$ by

$$48\rho_o^2\,\gamma = m(m^2-1)(m+2)\left(\sum_{t=1}^{k} \frac{1}{\nu_t^2} - \frac{1}{(\sum_t \nu_t)^2}\right) - 6m(m+1)(k-1)(1-\rho_o)^2$$

then with

$$f = \tfrac{1}{2}m(m+1)(k-1),$$

it can be shown that

$$\Pr\{-2\rho_o \log u > \tau\} = (1-\gamma)\Pr\{\chi^2_f > \tau\} + \gamma \Pr\{\chi^2_{f+4} > \tau\}$$
$$\qquad + \text{terms of order } \nu_t^{-3}.$$
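A minimal sketch of this test in code is given below. It follows the formulas of this section as reconstructed here; the function names and the two small sets of matrices are assumptions for the illustration, and the chi-square tail probability is again computed from a series for the incomplete gamma function.

```python
import math


def determinant(matrix):
    """Determinant as the product of forward-elimination pivots."""
    work = [[float(v) for v in row] for row in matrix]
    size = len(work)
    product = 1.0
    for k in range(size):
        product *= work[k][k]
        for i in range(k + 1, size):
            factor = work[i][k] / work[k][k]
            for j in range(k, size):
                work[i][j] -= factor * work[k][j]
    return product


def chi2_sf(x, df):
    """Upper-tail probability Pr(chi-square on df degrees of freedom >= x)."""
    if x <= 0:
        return 1.0
    a, y = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 0
    while abs(term) > 1e-16 * total and n < 10000:
        n += 1
        term *= y / (a + n)
        total += term
    return 1.0 - total * math.exp(-y + a * math.log(y) - math.lgamma(a))


def equal_dispersion_test(C, nu):
    """Return (-2 rho log u, approximate significance level).

    C  : list of k error matrices (each m x m)
    nu : list of their degrees of freedom
    """
    k, m = len(C), len(C[0])
    total_nu = sum(nu)
    pooled = [[sum(Ct[a][b] for Ct in C) / total_nu for b in range(m)]
              for a in range(m)]
    log_u = -0.5 * total_nu * math.log(determinant(pooled))
    for Ct, vt in zip(C, nu):
        estimate = [[value / vt for value in row] for row in Ct]
        log_u += 0.5 * vt * math.log(determinant(estimate))
    s1 = sum(1.0 / v for v in nu) - 1.0 / total_nu
    s2 = sum(1.0 / v**2 for v in nu) - 1.0 / total_nu**2
    rho = 1 - (2 * m * m + 3 * m - 1) / (6.0 * (m + 1) * (k - 1)) * s1
    gamma = (m * (m * m - 1) * (m + 2) * s2
             - 6 * m * (m + 1) * (k - 1) * (1 - rho) ** 2) / (48 * rho**2)
    f = m * (m + 1) * (k - 1) // 2
    tau = -2 * rho * log_u
    return tau, (1 - gamma) * chi2_sf(tau, f) + gamma * chi2_sf(tau, f + 4)


# Hypothetical case 1: both estimates equal the same matrix S.
S = [[2.0, 0.5], [0.5, 1.0]]
same = [[[10 * v for v in row] for row in S], [[15 * v for v in row] for row in S]]
tau_same, p_same = equal_dispersion_test(same, [10, 15])

# Hypothetical case 2: the first variance differs by a factor of four.
different = [[[10.0, 0.0], [0.0, 10.0]], [[40.0, 0.0], [0.0, 10.0]]]
tau_diff, p_diff = equal_dispersion_test(different, [10, 10])
print(tau_same, p_same, tau_diff, p_diff)
```

When every estimate is the same matrix, log u is zero and the significance level is 1; the second case gives a positive statistic and a smaller probability.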

Small values of u (and therefore large values of $-2\rho_o \log u$) are
significant of departures from $H_o$.

CHAPTER 15

LINEAR DISCRIMINANTS

15.1 Introduction

The use of linear discriminants is quite widespread in the
problem of "reducing the dimension of the data." Suppose it is
intended to make a detailed survey of, say, the general build and
physical development of 15 year old male children according to
their environment: urban, suburban, and rural. There are a large
number of measurements which can be made on the human body: height;
weight; waist measure, chest measure (exhaled and inhaled); length
of leg, arm; distance around neck, head, calf, thigh; shoulder width
and so on. Each of these measurements is an indication of build and
physical development which may vary according to the environment.
However, for obvious reasons, it is undesirable to amass a
superabundance of measurements on a large group of individuals.

There are, basically, two reasons why a particular measurement
could be excluded from consideration:

(i) it is so highly related to another observation that it
contributes little to our knowledge in the presence of
this other variable.

As an example, shoulder width and chest girth are highly correlated
variables. A casual glance at the human race is sufficient to
indicate that the arms hang down from the shoulders touching the side
of the rib cage in almost all cases so that we would intuitively
guess that this high correlation exists. However, other variables
(height and waist) have low correlation; it is noticeable that
people of the same height differ quite considerably in their
general build and in particular in the waist measurement.

(ii) a variable should also be considered for exclusion if
it does not differ with, in our case, different
environment, or if it differs so little as to be of little use
in our investigation.

As an example, many measurements on the skeleton (particularly
the skull) are no more variable across categories (suburban, urban,
rural) than they are within each category and, therefore, are
unlikely to contribute much to our understanding of variations across
these categories.

Rolf E. Bargmann (Part III of this contract) has given a
detailed account of the use and application of discriminant functions
so that in this volume we give only a résumé.

15.2 Selection of significant variables. 



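Suppose we have $k$ groups with, say, $n_i$ vector observations in 
the $i$-th group: 

    Group I:   $\mathbf{x}_{11}, \mathbf{x}_{12}, \ldots, \mathbf{x}_{1n_1}$ 
    Group II:  $\mathbf{x}_{21}, \mathbf{x}_{22}, \ldots, \mathbf{x}_{2n_2}$ 
    ... 
    Group i:   $\mathbf{x}_{i1}, \mathbf{x}_{i2}, \ldots, \mathbf{x}_{in_i}$ 
    ... 
    Group k:   $\mathbf{x}_{k1}, \mathbf{x}_{k2}, \ldots, \mathbf{x}_{kn_k}$ 
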

and we shall further suppose that each vector observation contains $m$ 
elements (possibly the anthropometric measurements discussed in 
section 15.1). The expectation of most elements of $\mathbf{x}_{ij}$ will change 
with group; some, however (skull measurements perhaps), will not. 

Rather than work with a vector of observations we choose to 
work with a scalar observation defined as a linear combination of 
the elements of the vector. Thus if $\mathbf{x}_{ij}$ ($m \times 1$) is the $j$-th replicate in 
group $i$, we construct 

$$z_{ij} = \mathbf{a}'\mathbf{x}_{ij}$$

for some, at the moment general, $\mathbf{a}$. The observations in group $i$, 
$z_{i1}, z_{i2}, \ldots, z_{in_i}$, are now a set of $n_i$ independent scalar observations. 



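Using the standard univariate techniques, we have 

    Source          Sum of Squares                                                      Degrees of Freedom 
    Between group   $\sum_{i=1}^{k} n_i (\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})^2$            $k-1$ 
    Within group    $\sum_{i=1}^{k} \sum_{j=1}^{n_i} (z_{ij} - \bar{z}_{i\cdot})^2$             $\sum_{i=1}^{k} (n_i - 1)$ 

where 

$$\bar{z}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} z_{ij}, \quad i = 1, 2, \ldots, k; \qquad \bar{z}_{\cdot\cdot} = \frac{\sum_{i} \sum_{j} z_{ij}}{\sum_{i} n_i}.$$
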

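Our usual test statistic expressing a difference between the groups is 

$$F = \frac{\dfrac{1}{k-1} \sum_{i} n_i (\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})^2}{\dfrac{1}{\sum_{i} (n_i - 1)} \sum_{i} \sum_{j} (z_{ij} - \bar{z}_{i\cdot})^2}$$

referred to F-tables, degrees of freedom $\{k-1;\ \sum_{i} (n_i - 1)\}$. 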
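
The table and test statistic above can be sketched numerically as follows. This is a minimal illustration, assuming NumPy is available; the function name `projected_f_statistic` and the representation of each group as an $n_i \times m$ array are choices made here for illustration, not part of the original text.

```python
import numpy as np

def projected_f_statistic(groups, a):
    """One-way analysis of variance on the scalars z_ij = a' x_ij.

    groups : list of k arrays, the i-th of shape (n_i, m)  (assumed layout)
    a      : weight vector of length m
    Returns (F, df_between, df_within).
    """
    a = np.asarray(a, dtype=float)
    # Project every vector observation onto a to get scalar observations.
    z = [np.asarray(g, dtype=float) @ a for g in groups]
    n = np.array([len(zi) for zi in z])
    k = len(z)

    group_means = np.array([zi.mean() for zi in z])      # z-bar_i.
    grand_mean = np.concatenate(z).sum() / n.sum()       # z-bar_..

    ss_between = np.sum(n * (group_means - grand_mean) ** 2)
    ss_within = sum(np.sum((zi - zi.mean()) ** 2) for zi in z)

    df_between = k - 1
    df_within = int(np.sum(n - 1))
    F = (ss_between / df_between) / (ss_within / df_within)
    return F, df_between, df_within

# Made-up numbers, purely illustrative: two groups, two variables.
demo_groups = [np.array([[1.0, 0.0], [3.0, 0.0]]),
               np.array([[5.0, 0.0], [7.0, 0.0]])]
F_demo, df1_demo, df2_demo = projected_f_statistic(demo_groups, [1.0, 0.0])
```

For any fixed weight vector the returned value may be referred to F-tables with the two degrees of freedom returned alongside it.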

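The choice of $\mathbf{a}$ is arbitrary at the moment but it seems reasonable 
to select that value of $\mathbf{a}$ which maximizes the test statistic $F$. It is easily 
shown that $\mathbf{a}$ must satisfy 

$$(H - \theta E)\,\mathbf{a} = \mathbf{0}$$

where 

$$E = \sum_{i} \sum_{j} (\mathbf{x}_{ij} - \bar{\mathbf{x}}_{i\cdot})(\mathbf{x}_{ij} - \bar{\mathbf{x}}_{i\cdot})'$$

(the familiar "error matrix") 

and 

$$H = \sum_{i} n_i (\bar{\mathbf{x}}_{i\cdot} - \bar{\mathbf{x}}_{\cdot\cdot})(\bar{\mathbf{x}}_{i\cdot} - \bar{\mathbf{x}}_{\cdot\cdot})'$$

(the familiar "hypothesis matrix"). 
$\theta$ is the maximal solution of $|H - \theta E| = 0$ and $\mathbf{a}$ the associated eigenvector. 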
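
A minimal numerical sketch of this step follows, assuming NumPy and assuming that $E$ is positive definite (which requires at least $m$ within-group degrees of freedom). The function names are hypothetical. The generalized problem $(H - \theta E)\mathbf{a} = \mathbf{0}$ is reduced here to an ordinary symmetric eigenproblem through a Cholesky factor of $E$; this is one convenient route among several, not a method prescribed by the text.

```python
import numpy as np

def error_and_hypothesis_matrices(groups):
    """Return (E, H) for a list of k arrays, the i-th of shape (n_i, m)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    grand_mean = np.vstack(groups).mean(axis=0)          # x-bar_..
    m = groups[0].shape[1]
    E = np.zeros((m, m))
    H = np.zeros((m, m))
    for g in groups:
        mean_i = g.mean(axis=0)                          # x-bar_i.
        centred = g - mean_i
        E += centred.T @ centred                         # within-group products
        d = (mean_i - grand_mean)[:, None]
        H += len(g) * (d @ d.T)                          # weighted by n_i
    return E, H

def best_discriminant(groups):
    """Largest root theta of |H - theta E| = 0 and its eigenvector a."""
    E, H = error_and_hypothesis_matrices(groups)
    L = np.linalg.cholesky(E)                            # E = L L' (E assumed positive definite)
    L_inv = np.linalg.inv(L)
    S = L_inv @ H @ L_inv.T                              # symmetric, same roots as the pair (H, E)
    roots, vectors = np.linalg.eigh(S)                   # eigenvalues in ascending order
    theta = roots[-1]
    a = L_inv.T @ vectors[:, -1]                         # map back: a = (L')^{-1} v
    return theta, a
```

Since the between-group and within-group sums of squares of $z_{ij} = \mathbf{a}'\mathbf{x}_{ij}$ are $\mathbf{a}'H\mathbf{a}$ and $\mathbf{a}'E\mathbf{a}$, the maximized statistic is $\theta$ multiplied by $\sum_i (n_i - 1)/(k-1)$. The eigenvector is determined only up to a scalar multiple, so its sign and length carry no meaning.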



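Of course, with $\mathbf{a}$ so selected the "between group" and "within 
group" sums of squares no longer have $\chi^2$-distributions. However, 
we may say that $z_{ij} = \mathbf{a}'\mathbf{x}_{ij}$ is the linear function of the elements 
of $\mathbf{x}_{ij}$ which discriminates best between the groups. If 

$$\mathbf{x}_{ij} = \begin{pmatrix} {}_1x_{ij} \\ {}_2x_{ij} \\ \vdots \\ {}_mx_{ij} \end{pmatrix}$$

then we construct a table of correlations between $z_{ij}$ and ${}_qx_{ij}$, 
$q = 1, 2, \ldots, m$; that is, we compute 

$$r_q = \frac{\sum_{i}\sum_{j} (z_{ij} - \bar{z}_{i\cdot})({}_qx_{ij} - {}_q\bar{x}_{i\cdot})}{\sqrt{\left(\sum_{i}\sum_{j} (z_{ij} - \bar{z}_{i\cdot})^2\right)\left(\sum_{i}\sum_{j} ({}_qx_{ij} - {}_q\bar{x}_{i\cdot})^2\right)}}$$
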

The inference is that a high value of $r_q$ indicates that the elements 
${}_qx_{ij}$ are "almost as good" as $z_{ij}$ in discriminating between the groups. 
The set of small correlations is noted ($r_q \le 0.25$, say) and it may 
be decided not to make the corresponding measurements in a large 
scale experiment. 
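
The screening rule just described can be sketched as follows, assuming NumPy and the same group layout as before (a list of $n_i \times m$ arrays). The function names are hypothetical. One assumption is added here that the text does not state: because the sign of an eigenvector is arbitrary, the sketch compares the absolute value of $r_q$ with the cutoff. The default of 0.25 is the illustrative figure quoted above.

```python
import numpy as np

def within_group_correlations(groups, a):
    """r_q, q = 1..m: correlation of z_ij = a' x_ij with the q-th element
    of x_ij, both centred at their own group means and pooled over groups."""
    a = np.asarray(a, dtype=float)
    # Deviations x_ij - x-bar_i., stacked over all groups.
    centred = np.vstack([np.asarray(g, dtype=float) - np.mean(g, axis=0)
                         for g in groups])
    z_centred = centred @ a                      # equals z_ij - z-bar_i.
    numerator = centred.T @ z_centred            # one cross-product sum per q
    denominator = np.sqrt(np.sum(z_centred ** 2) * np.sum(centred ** 2, axis=0))
    return numerator / denominator

def weak_variables(r, cutoff=0.25):
    """1-based indices q with |r_q| <= cutoff (absolute value is an
    assumption made in this sketch, see the lead-in)."""
    return [q + 1 for q, r_q in enumerate(r) if abs(r_q) <= cutoff]
```

The indices returned by `weak_variables` are only candidates for omission; the correlations carry no formal significance level.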

This approach, it is admitted, is somewhat lacking in distributional 
justification; however, it does give us some idea of the 
relative roles of the elements $\{{}_qx_{ij}\}_{q=1}^{m}$ in respect of their 
differences across the groups. Having decided that certain elements 
are of the non-contributing type, it is as well to use the step-down 
approach (section 12.3) as a check. 






